28 result(s) for "Marijon, Pierre"
Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads
Human genomes are typically assembled as consensus sequences that lack information on parental haplotypes. Here we describe a reference-free workflow for diploid de novo genome assembly that combines the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing [1,2] with continuous long-read or high-fidelity [3] sequencing data. Employing this strategy, we produced a completely phased de novo genome assembly for each haplotype of an individual of Puerto Rican descent (HG00733) in the absence of parental data. The assemblies are accurate (quality value > 40) and highly contiguous (contig N50 > 23 Mbp) with low switch error rates (0.17%), providing fully phased single-nucleotide variants, indels and structural variants. A comparison of Oxford Nanopore Technologies and Pacific Biosciences phased assemblies identified 154 regions that are preferential sites of contig breaks, irrespective of sequencing technology or phasing algorithms. Assembly of haplotype-resolved human genomes is achieved by combining short and long reads.
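The two assembly metrics quoted in this abstract can be illustrated with a minimal Python sketch, assuming the conventional definitions: contig N50 is the length L such that contigs of length at least L together cover at least half of the assembled bases, and the quality value is the Phred-scaled error rate, -10 * log10(error rate). The function names and the toy contig lengths are illustrative assumptions and are not taken from the paper.

```python
import math


def n50(lengths):
    """Return the contig N50 of a list of contig lengths (conventional definition)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 requires at least one contig")


def phred_qv(error_rate):
    """Convert a per-base error rate into a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)


# Toy values for illustration only (not data from the paper).
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
print(phred_qv(1e-4))
```

Under this definition, a quality value above 40 corresponds to fewer than one erroneous base per 10,000 assembled bases.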
A draft human pangenome reference
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals [1]. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample. An initial draft of the human pangenome is presented and made publicly available by the Human Pangenome Reference Consortium; the draft contains 94 de novo haplotype assemblies from 47 ancestrally diverse individuals.
Increased mutation and gene conversion within human segmental duplications
Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data [1,2]. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions [3,4]. We find that human SNVs are elevated 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC) with up to 4.3 megabase pairs of SD sequence converted on average per human haplotype. We develop a genome-wide map of IGC donors and acceptors, including 498 acceptor and 454 donor hotspots affecting the exons of about 800 protein-coding genes. These include 171 genes that have ‘relocated’ on average 1.61 megabase pairs in a subset of human haplotypes. Using a coalescent framework, we show that SD regions are slightly evolutionarily older when compared to unique sequences, probably owing to IGC. SNVs in SDs, however, show a distinct mutational spectrum: a 27.1% increase in transversions that convert cytosine to guanine or the reverse across all triplet contexts and a 7.6% reduction in the frequency of CpG-associated mutations when compared to unique DNA. We reason that these distinct mutational properties help to maintain an overall higher GC content of SD DNA compared to that of unique DNA, probably driven by GC-biased conversion between paralogous sequences [5,6]. A study comparing the pattern of single-nucleotide variation between unique and duplicated regions of the human genome shows that mutation rate and interlocus gene conversion are elevated in duplicated regions.
Cutevariant: a standalone GUI-based desktop application to explore genetic variations from an annotated VCF file
Summary: Cutevariant is a graphical user interface (GUI)-based desktop application designed to filter variations from an annotated VCF file. The application imports data into a local SQLite database where complex filter queries can be built either from GUI controllers or using a domain-specific language called Variant Query Language. Cutevariant provides more features than existing applications and is fully customizable thanks to a complete plugin architecture. Availability and implementation: Cutevariant is distributed as multiplatform client-side software under an open source license and is available at https://github.com/labsquare/cutevariant.
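The import-then-filter idea described in this abstract can be sketched with Python's standard `sqlite3` module. This is only an illustration of querying variants held in a local SQLite database: the `variants` table, its columns, the placeholder rows, and the helper names are all hypothetical, and the sketch reproduces neither Cutevariant's actual schema nor its Variant Query Language.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical variants table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE variants ("
        "chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, gene TEXT, impact TEXT)"
    )
    # Placeholder rows standing in for records parsed from an annotated VCF.
    rows = [
        ("chr1", 100, "A", "T", "GENE_A", "HIGH"),
        ("chr1", 250, "G", "C", "GENE_A", "LOW"),
        ("chr2", 300, "C", "G", "GENE_B", "HIGH"),
    ]
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def filter_variants(conn, gene, impact):
    """Return (chrom, pos, ref, alt) for variants matching a gene and impact."""
    cursor = conn.execute(
        "SELECT chrom, pos, ref, alt FROM variants "
        "WHERE gene = ? AND impact = ? ORDER BY pos",
        (gene, impact),
    )
    return cursor.fetchall()


conn = build_demo_db()
print(filter_variants(conn, "GENE_A", "HIGH"))
```

Parameterized queries (the `?` placeholders) keep user-supplied filter values separate from the SQL text, which is the usual choice when a GUI or a query language is translated into SQL.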
Cutevariant: a GUI-based desktop application to explore genetics variations
Abstract: Cutevariant is a user-friendly GUI-based desktop application for genomic research, designed to search for variations in DNA samples collected in annotated files encoded in the Variant Call Format. The application imports data into a local relational database from which complex filter queries can be built either from the intuitive GUI or using a Domain Specific Language (DSL). Cutevariant provides more features than any existing application without compromising on performance. The plugin-based architecture provides highly customizable features. Cutevariant is distributed as multiplatform client-side software under an open source licence and is available at https://github.com/labsquare/Cutevariant. It has been designed from the beginning to be easily adopted by IT-agnostic end-users. Competing Interest Statement: The authors have declared no competing interest. Footnotes: https://labsquare.github.io/cutevariant/
yacrd and fpa: upstream tools for long-read genome assembly
Motivation: Genome assembly is increasingly performed on long, uncorrected reads. Assembly quality may be degraded by unfiltered chimeric reads; also, storing all read overlaps can take up to terabytes of disk space. Results: We introduce two tools, yacrd and fpa, which respectively perform chimera removal and read scrubbing, and filter out spurious overlaps. We show that yacrd results in higher-quality assemblies and is one hundred times faster than the best available alternative. Availability: https://github.com/natir/yacrd and https://github.com/natir/fpa Footnotes: Important revision: we added many datasets (more than 60) to the results and two figures presenting how our tools work. https://github.com/natir/yacrd-and-fpa-upstream-tools-for-lr-genome-assembly
Critical Assessment of Metagenome Interpretation - the second round of challenges
Evaluating metagenomic software is key for optimizing metagenome interpretation and is the focus of the community-driven initiative for the Critical Assessment of Metagenome Interpretation (CAMI). In its second challenge, CAMI engaged the community to assess their methods on realistic and complex metagenomic datasets with long and short reads, created from ~1,700 novel and known microbial genomes, as well as ~600 novel plasmids and viruses. Altogether 5,002 results by 76 program versions were analyzed, representing a 22x increase in results. Substantial improvements were seen in metagenome assembly, some due to using long-read data. The presence of related strains was still challenging for assembly and genome binning, as was assembly quality for the latter. Taxon profilers demonstrated a marked maturation, with taxon profilers and binners excelling at higher bacterial taxonomic ranks, but underperforming for viruses and archaea. Assessment of clinical pathogen detection techniques revealed a need to improve reproducibility. Analysis of program runtimes and memory usage identified highly efficient programs, including some top performers by other metrics. The CAMI II results identify current challenges, but also guide researchers in selecting methods for specific analyses. Competing Interest Statement: A.E.D. co-founded Longas Technologies Pty Ltd, a company aimed at development of synthetic long-read sequencing technologies. Footnotes: http://www.cami-challenge.org/
Heart Rate and Risk of Cancer Death in Healthy Men
Data from several previous studies examining heart-rate and cardiovascular risk have hinted at a possible relationship between heart-rate and non-cardiac mortality. We thus systematically examined the predictive value of heart-rate variables on the subsequent risk of death from cancer. In the Paris Prospective Study I, 6101 asymptomatic French working men aged 42 to 53 years, free of clinically detectable cardiovascular disease and cancer, underwent a standardized graded exercise test between 1967 and 1972. Resting heart-rate, heart-rate increase during exercise, and decrease during recovery were measured. Change in resting heart-rate over 5 years was also available in 5139 men. Mortality, including 758 cancer deaths, was assessed over the 25 years of follow-up. There were strong, graded and significant relationships between all heart-rate parameters and subsequent cancer deaths. After adjustment for age and tobacco consumption, and compared with the lowest quartile, those in the highest quartile for resting heart-rate had a relative risk of 2.4 for cancer deaths (95% confidence interval: 1.9-2.9, p<0.0001). This was similar after adjustment for traditional cardiovascular risk factors and was observed for the commonest malignancies (respiratory and gastrointestinal). Similarly, significant relationships with cancer death were observed for poor heart-rate increase during exercise, poor decrease during recovery and greater heart-rate increase over time (p<0.0001 for all). Resting and exercise heart-rate had consistent, graded and highly significant associations with subsequent cancer mortality in men.