Catalogue Search | MBRL

Trycycler: consensus long-read assemblies for bacterial genomes

by Méric, Guillaume; Holt, Kathryn E.; Hawkey, Jane in Animal Genetics and Genomics; automation; Bacterial genomics

2021

While long-read sequencing allows for the complete assembly of bacterial genomes, long-read assemblies contain a variety of errors. Here, we present Trycycler, a tool which produces a consensus assembly from multiple input assemblies of the same genome. Benchmarking showed that Trycycler assemblies contained fewer errors than assemblies constructed with a single tool. Post-assembly polishing further reduced errors and Trycycler+polishing assemblies were the most accurate genomes in our study. As Trycycler requires manual intervention, its output is not deterministic. However, we demonstrated that multiple users converge on similar assemblies that are consistently more accurate than those produced by automated assembly tools.

Journal Article

DNA methylation-calling tools for Oxford Nanopore sequencing: a survey and human epigenome-wide evaluation

by Cheng, Albert; Rosikiewicz, Wojciech; Li, Sheng in 5-Methylcytosine - analysis; Accuracy; Animal Genetics and Genomics

2021

Background Nanopore long-read sequencing technology greatly expands the capacity of long-range, single-molecule DNA-modification detection. A growing number of analytical tools have been developed to detect DNA methylation from nanopore sequencing reads. Here, we assess the performance of different methylation-calling tools to provide a systematic evaluation to guide researchers performing human epigenome-wide studies. Results We compare seven analytic tools for detecting DNA methylation from nanopore long-read sequencing data generated from human natural DNA at a whole-genome scale. We evaluate the per-read and per-site performance of CpG methylation prediction across different genomic contexts, CpG site coverage, and computational resources consumed by each tool. The seven tools exhibit different performances across the evaluation criteria. We show that the methylation prediction at regions with discordant DNA methylation patterns, intergenic regions, low CG density regions, and repetitive regions show room for improvement across all tools. Furthermore, we demonstrate that 5hmC levels at least partly contribute to the discrepancy between bisulfite and nanopore sequencing. Lastly, we provide an online DNA methylation database (https://nanome.jax.org) to display the DNA methylation levels detected by nanopore sequencing and bisulfite sequencing data across different genomic contexts. Conclusions Our study is the first systematic benchmark of computational methods for detection of mammalian whole-genome DNA modifications in nanopore sequencing. We provide a broad foundation for cross-platform standardization and an evaluation of analytical tools designed for genome-scale modified base detection using nanopore sequencing.

Journal Article

Comprehensive characterization of single-cell full-length isoforms in human and mouse with long-read sequencing

by Peng, Hongke; Amarasinghe, Shanika L.; Su, Shian in Alternative Splicing; Animal Genetics and Genomics; Animals

2021

A modified Chromium 10x droplet-based protocol that subsamples cells for both short-read and long-read (nanopore) sequencing together with a new computational pipeline (FLAMES) is developed to enable isoform discovery, splicing analysis, and mutation detection in single cells. We identify thousands of unannotated isoforms and find conserved functional modules that are enriched for alternative transcript usage in different cell types and species, including ribosome biogenesis and mRNA splicing. Analysis at the transcript level allows data integration with scATAC-seq on individual promoters, improved correlation with protein expression data, and linked mutations known to confer drug resistance to transcriptome heterogeneity.

Journal Article

Characterization of structural variation in Tibetans reveals new evidence of high-altitude adaptation and introgression

by Wang, Yahui; Zhou, Gangqiao; Ping, Jie in Adaptation; Adaptation, Physiological - genetics; Altitude

2021

Background Structural variation (SV) acts as an essential mutational force shaping the evolution and function of the human genome. However, few studies have examined the role of SVs in high-altitude adaptation and little is known of adaptive introgressed SVs in Tibetans so far. Results Here, we generate a comprehensive catalog of SVs in a Chinese Tibetan (n = 15) and Han (n = 10) population using nanopore sequencing technology. Among a total of 38,216 unique SVs in the catalog, 27% are sequence-resolved for the first time. We systematically assess the distribution of these SVs across repeat sequences and functional genomic regions. Through genotyping in an additional 276 genomes, we identify 69 Tibetan-Han stratified SVs and 80 candidate adaptive genes. We also discover a few adaptive introgressed SV candidates and provide evidence for a deletion of 335 base pairs at 1p36.32. Conclusions Overall, our results highlight the important role of SVs in the evolutionary processes of Tibetans’ adaptation to the Qinghai-Tibet Plateau and provide a valuable resource for future high-altitude adaptation studies.

Journal Article

Nanopore sequencing reveals endogenous NMD-targeted isoforms in human cells

by Zavolan, Mihaela; Karousis, Evangelos D.; Mühlemann, Oliver in 3' Untranslated regions; Alternative splicing; Animal Genetics and Genomics

2021

Background Nonsense-mediated mRNA decay (NMD) is a eukaryotic, translation-dependent degradation pathway that targets mRNAs with premature termination codons and also regulates the expression of some mRNAs that encode full-length proteins. Although many genes express NMD-sensitive transcripts, identifying them based on short-read sequencing data remains a challenge. Results To identify and analyze endogenous targets of NMD, we apply cDNA Nanopore sequencing and short-read sequencing to human cells with varying expression levels of NMD factors. Our approach detects full-length NMD substrates that are highly unstable and increase in levels or even only appear when NMD is inhibited. Among the many new NMD-targeted isoforms that our analysis identifies, most derive from alternative exon usage. The isoform-aware analysis reveals many genes with significant changes in splicing but no significant changes in overall expression levels upon NMD knockdown. NMD-sensitive mRNAs have more exons in the 3′ UTR and, for those mRNAs with a termination codon in the last exon, the length of the 3′ UTR per se does not correlate with NMD sensitivity. Analysis of splicing signals reveals isoforms where NMD has been co-opted in the regulation of gene expression, though the main function of NMD seems to be ridding the transcriptome of isoforms resulting from spurious splicing events. Conclusions Long-read sequencing enables the identification of many novel NMD-sensitive mRNAs and reveals both known and unexpected features concerning their biogenesis and their biological role. Our data provide a highly valuable resource of human NMD transcript targets for future genomic and transcriptomic applications.

Journal Article

A chromosome‐scale assembly of allotetraploid Brassica juncea (AABB) elucidates comparative architecture of the A and B genomes

by Pradhan, Akshay Kumar; Paritosh, Kumar; Bhayana, Latika in allotetraploidy; Asia; Assembly

2021

Summary Brassica juncea (AABB), commonly referred to as mustard, is a natural allopolyploid of two diploid species—B. rapa (AA) and B. nigra (BB). We report a highly contiguous genome assembly of an oleiferous type of B. juncea variety Varuna, an archetypical Indian gene pool line of mustard, with ~100× PacBio single‐molecule real‐time (SMRT) long reads providing contigs with an N50 value of >5 Mb. Contigs were corrected for the misassemblies and scaffolded with BioNano optical mapping. We also assembled a draft genome of B. nigra (BB) variety Sangam using Illumina short‐read sequencing and Oxford Nanopore long reads and used it to validate the assembly of the B genome of B. juncea. Two different linkage maps of B. juncea, containing a large number of genotyping‐by‐sequencing markers, were developed and used to anchor scaffolds/contigs to the 18 linkage groups of the species. The resulting chromosome‐scale assembly of B. juncea Varuna is a significant improvement over the previous draft assembly of B. juncea Tumida, a vegetable type of mustard. The assembled genome was characterized for transposons, centromeric repeats, gene content and gene block associations. In comparison to the A genome, the B genome contains a significantly higher content of LTR/Gypsy retrotransposons, distinct centromeric repeats and a large number of B. nigra specific gene clusters that break the gene collinearity between the A and the B genomes. The B. juncea Varuna assembly will be of major value to the breeding work on oleiferous types of mustard that are grown extensively in south Asia and elsewhere.

Journal Article

Extracellular vesicles, RNA sequencing, and bioinformatic analyses: Challenges, solutions, and recommendations

by Fullard, John F.; Heyliger, Simon O.; Saulsbury, Marilyn D. in Bioinformatics; Biomarkers; Biopsy

2024

Extracellular vesicles (EVs) are heterogeneous entities secreted by cells into their microenvironment and systemic circulation. Circulating EVs carry functional small RNAs and other molecular footprints from their cell of origin, and thus have evident applications in liquid biopsy, therapeutics, and intercellular communication. Yet, the complete transcriptomic landscape of EVs is poorly characterized due to critical limitations including variable protocols used for EV‐RNA extraction, quality control, cDNA library preparation, sequencing technologies, and bioinformatic analyses. Consequently, there is a gap in knowledge and a need for a standardized approach in delineating EV‐RNAs. Here, we address these gaps by (1) focusing on the large canopy of EVs and particles (EVPs), which includes, but is not limited to, exosomes and other large and small EVs, lipoproteins, exomeres/supermeres, mitochondrial‐derived vesicles, RNA binding proteins, and cell‐free DNA/RNA/proteins; (2) examining the potential functional roles and biogenesis of EVPs; (3) discussing various transcriptomic methods and technologies used in uncovering the cargoes of EVPs; (4) presenting a comprehensive list of RNA subtypes reported in EVPs; (5) describing different EV‐RNA databases and resources specific to EV‐RNA species; (6) reviewing established bioinformatics pipelines and novel strategies for reproducible EV transcriptomics analyses; (7) emphasizing the significant need for a gold standard approach in identifying EV‐RNAs across studies; and (8) highlighting current challenges, discussing possible solutions, and presenting recommendations for robust and reproducible analyses of EVP‐associated small RNAs. Overall, we seek to provide clarity on the transcriptomics landscape, sequencing technologies, and bioinformatic analyses of EVP‐RNAs.
A detailed portrayal of the current state of EVP transcriptomics will lead to a better understanding of how the RNA cargo of EVPs can be used in modern and targeted diagnostics and therapeutics. For the inclusion of different particles discussed in this article, we use the terms large/small EVs, non‐vesicular extracellular particles (NVEPs), EPs and EVPs as defined in the MISEV guidelines by the International Society for Extracellular Vesicles (ISEV). Overview of the RNA landscape of EVPs. Most commonly studied small EVP subtypes, ranging in diameter from ~1 nm to >200 nm (top), and the most common types of RNA found within EVPs, including miRNA, piRNA, tRNA, snRNA, YRNA, circRNA, snoRNA, lncRNA, mRNA, and rRNA (bottom).

Journal Article

The pan‐genome of the cultivated soybean (PanSoy) reveals an extraordinarily conserved gene content

by Torkamaneh, Davoud; Belzile, François; Lemay, Marc‐André in biotechnology; Cultivation; de novo assembly

2021

Summary Studies on structural variation in plants have revealed the inadequacy of a single reference genome for an entire species and suggest that it is necessary to build a species‐representative genome called a pan‐genome to better capture the extent of both structural and nucleotide variation. Here, we present a pan‐genome of cultivated soybean (Glycine max), termed PanSoy, constructed using the de novo genome assembly of 204 phylogenetically and geographically representative improved accessions selected from the larger GmHapMap collection. PanSoy uncovers 108 Mb (~11%) of novel nonreference sequences encompassing 3621 protein‐coding genes (including 1659 novel genes) absent from the soybean ‘Williams 82’ reference genome. Nonetheless, the core genome represents an exceptionally large proportion of the genome, with >90.6% of genes being shared by >99% of the accessions. A majority of PAVs encompassing genes could be confirmed with long‐read sequencing on a subset of accessions. The PanSoy is a major step towards capturing the extent of genetic variation in cultivated soybean and provides a resource for soybean genomics research and breeding.

Journal Article

A benchmark of structural variation detection by long reads through a realistic simulated model

by Dierckxsens, Nicolas; Xie, Zhi; Li, Tong in Accuracy; Algorithms; Animal Genetics and Genomics

2021

Accurate simulations of structural variation distributions and sequencing data are crucial for the development and benchmarking of new tools. We develop Sim-it, a straightforward tool for the simulation of both structural variation and long-read data. These simulations from Sim-it reveal the strengths and weaknesses of currently available structural variation callers and long-read sequencing platforms. With these findings, we develop a new method (combiSV) that can combine the results from structural variation callers into a superior call set with increased recall and precision, which is also observed for the latest structural variation benchmark set developed by the GIAB Consortium.

Journal Article

A high‐quality Brassica napus genome reveals expansion of transposable elements, subgenome evolution and disease resistance

by Wang, Youping; Tu, Jinxing; Chen, Fei in allotetraploidy; Assembly; biotechnology

2021

Summary Rapeseed (Brassica napus L.) is a recent allotetraploid crop, which is well known for its high oil production. Here, we report a high‐quality genome assembly of a typical semi‐winter rapeseed cultivar, 'Zhongshuang11' (hereafter 'ZS11'), using a combination of single‐molecule sequencing and chromosome conformation capture (Hi‐C) techniques. Most of the high‐confidence sequences (93.1%) were anchored to the individual chromosomes with a total of 19 centromeres identified, matching the exact chromosome count of B. napus. The repeat sequences in the A and C subgenomes in B. napus expanded significantly from 500 000 years ago, especially over the last 100 000 years. These young and recently amplified LTR‐RTs showed dispersed chromosomal distribution but significantly preferentially clustered into centromeric regions. We exhaustively annotated the nucleotide‐binding leucine‐rich repeat (NLR) gene repertoire, yielding a total of 597 NLR genes in the B. napus genome, 17.4% of which are paired (head‐to‐head arrangement). Based on the resequencing data of 991 B. napus accessions, we have identified 18 759 245 single nucleotide polymorphisms (SNPs) and detected a large number of genomic regions under selective sweep among the three major ecotype groups (winter, semi‐winter and spring) in B. napus. We found 49 NLR genes and five NLR gene pairs colocated in selective sweep regions with different ecotypes, suggesting a rapid diversification of NLR genes during the domestication of B. napus. The high quality of our B. napus 'ZS11' genome assembly could serve as an important resource for the study of rapeseed genomics and reveal the genetic variations associated with important agronomic traits.

Journal Article
