Catalogue Search | MBRL

Dereplication of microbial metabolites through database search of mass spectra

by Pevzner, Pavel A. , Gurevich, Alexey , Cao, Liu in 119/118 , 631/114/2164 , 631/92/349

2018

Natural products have traditionally been rich sources for drug discovery. In order to clear the road toward the discovery of unknown natural products, biologists need dereplication strategies that identify known ones. Here we report DEREPLICATOR+, an algorithm that improves on the previous approaches for identifying peptidic natural products, and extends them for identification of polyketides, terpenes, benzenoids, alkaloids, flavonoids, and other classes of natural products. We show that DEREPLICATOR+ can search all spectra in the recently launched Global Natural Products Social molecular network and identify an order of magnitude more natural products than previous dereplication efforts. We further demonstrate that DEREPLICATOR+ enables cross-validation of genome-mining and peptidogenomics/glycogenomics results. New natural products can be identified via mass spectrometry by excluding all known ones from the analysis, a process called dereplication. Here, the authors extend a previously published dereplication algorithm to different classes of secondary metabolites.

Journal Article

The structure, function and evolution of a complete human chromosome 8

by Sorensen, Melanie , Mikheenko, Alla , Jain, Chirag in 13/106 , 14/19 , 14/32

2021

The complete assembly of each human chromosome is essential for understanding human biology and evolution (refs. 1, 2). Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence. The complete assembly of human chromosome 8 resolves previous gaps and reveals hidden complex forms of genetic variation, enabling functional and evolutionary characterization of primate centromeres.

Journal Article

Dereplication of peptidic natural products through database search of mass spectra

by Garg, Neha , Pevzner, Pavel A , Gurevich, Alexey in 631/114/2184 , 631/92/349 , 82/58

2017

Aggregated mass spectral data by consortia such as the Global Natural Products Social (GNPS) molecular networking infrastructure enable natural product discovery. DEREPLICATOR, validated on peptidic natural products, is a computational tool to identify known metabolites in complex samples. Peptidic natural products (PNPs) are widely used compounds that include many antibiotics and a variety of other bioactive peptides. Although recent breakthroughs in PNP discovery raised the challenge of developing new algorithms for their analysis, identification of PNPs via database search of tandem mass spectra remains an open problem. To address this problem, natural product researchers use dereplication strategies that identify known PNPs and lead to the discovery of new ones, even in cases when the reference spectra are not present in existing spectral libraries. DEREPLICATOR is a new dereplication algorithm that enables high-throughput PNP identification and that is compatible with large-scale mass-spectrometry-based screening platforms for natural product discovery. After searching nearly one hundred million tandem mass spectra in the Global Natural Products Social (GNPS) molecular networking infrastructure, DEREPLICATOR identified an order of magnitude more PNPs (and their new variants) than any previous dereplication efforts.

Journal Article

Single-nuclei isoform RNA sequencing unlocks barcoded exon connectivity in frozen brain tissue

by Jarvis, Erich , Trojanowski, John Q. , Gan, Li in 631/337/2019 , 631/378/340 , 631/61/212/2301

2022

Single-nuclei RNA sequencing characterizes cell types at the gene level. However, compared to single-cell approaches, many single-nuclei cDNAs are purely intronic, lack barcodes and hinder the study of isoforms. Here we present single-nuclei isoform RNA sequencing (SnISOr-Seq). Using microfluidics, PCR-based artifact removal, target enrichment and long-read sequencing, SnISOr-Seq increased barcoded, exon-spanning long reads 7.5-fold compared to naive long-read single-nuclei sequencing. We applied SnISOr-Seq to adult human frontal cortex and found that exons associated with autism exhibit coordinated and highly cell-type-specific inclusion. We found two distinct combination patterns: those distinguishing neural cell types, enriched in TSS-exon, exon-polyadenylation-site and non-adjacent exon pairs, and those with multiple configurations within one cell type, enriched in adjacent exon pairs. Finally, we observed that human-specific exons are almost as tightly coordinated as conserved exons, implying that coordination can be rapidly established during evolution. SnISOr-Seq enables cell-type-specific long-read isoform analysis in human brain and in any frozen or hard-to-dissociate sample. Complete RNA isoforms are captured by a new single-nuclei sequencing method.

Journal Article

Extending rnaSPAdes functionality for hybrid transcriptome assembly

by Antipov, Dmitry , Puglia, Giuseppe D. , Giordano, Daniela in Algorithms , Analysis , Assemblies

2020

Background: De novo RNA-Seq assembly is a powerful method for analysing transcriptomes when the reference genome is not available or poorly annotated. However, due to the short length of Illumina reads it is usually impossible to reconstruct complete sequences of complex genes and alternative isoforms. The recently emerged possibility of generating long RNA reads, with technologies such as PacBio and Oxford Nanopore, may dramatically improve the assembly quality, and thus the subsequent analysis. While reference-based tools for analysing long RNA reads were recently developed, there is no established pipeline for de novo assembly of such data. Results: In this work we present a novel method that enables high-quality de novo transcriptome assembly by combining the accuracy and reliability of short reads with exon structure information derived from long error-prone reads. The algorithm is designed by incorporating the existing hybridSPAdes approach into the rnaSPAdes pipeline and adapting it for transcriptomic data. Conclusion: To evaluate the benefit of using long RNA reads we selected several datasets containing both Illumina and Iso-seq or Oxford Nanopore Technologies (ONT) reads. Using existing quality assessment software, we show that hybrid assemblies performed with rnaSPAdes contain more full-length genes and alternative isoforms compared to the case when only short-read data is used.

Journal Article

Dual-targeting CRISPR-CasRx reduces C9orf72 ALS/FTD sense and antisense repeat RNAs in vitro and in vivo

by de Oliveira, Paula , Hölbling, Benedikt V. , Lignani, Gabriele in 13/44 , 631/378/1689/1285 , 631/378/1689/364

2025

The most common genetic cause of frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS) is an intronic G4C2 repeat expansion in C9orf72. The repeats undergo bidirectional transcription to produce sense and antisense repeat RNA species, which are translated into dipeptide repeat proteins (DPRs). As toxicity has been associated with both sense and antisense repeat-derived RNA and DPRs, targeting both strands may provide the most effective therapeutic strategy. CRISPR-Cas13 systems mature their own guide arrays, allowing targeting of multiple RNA species from a single construct. We show CRISPR-Cas13d variant CasRx effectively reduces overexpressed C9orf72 sense and antisense repeat transcripts and DPRs in HEK cells. In C9orf72 patient-derived iPSC-neuron lines, CRISPR-CasRx reduces endogenous sense and antisense repeat RNAs and DPRs and protects against glutamate-induced excitotoxicity. AAV delivery of CRISPR-CasRx to two distinct C9orf72 repeat mouse models significantly reduced both sense and antisense repeat-containing transcripts. This highlights the potential of RNA-targeting CRISPR systems as therapeutics for C9orf72 ALS/FTD. CRISPR-CasRx effectively reduces ALS- and FTD-causing C9orf72 sense and antisense repeat derived RNAs and proteins in cell lines, patient iPSC-neurons and two independent mouse models of C9orf72 repeat expansion.

Journal Article

Increased diversity of peptidic natural products revealed by modification-tolerant database search of mass spectra

by Pevzner, Pavel A. , Gurevich, Alexey , Mohimani, Hosein in 631/114 , 631/326/2522 , 631/61/320

2018

Peptidic natural products (PNPs) include many antibiotics and other bioactive compounds. While the recent launch of the Global Natural Products Social (GNPS) molecular networking infrastructure is transforming PNP discovery into a high-throughput technology, PNP identification algorithms are needed to realize the potential of the GNPS project. GNPS relies on the assumption that each connected component of a molecular network (representing related metabolites) illuminates the ‘dark matter of metabolomics’ as long as it contains a known metabolite present in a database. We reveal a surprising diversity of PNPs produced by related bacteria and show that, contrary to the ‘comparative metabolomics’ assumption, two related bacteria are unlikely to produce identical PNPs (even though they are likely to produce similar PNPs). Since this observation undermines the utility of GNPS, we developed a PNP identification tool, VarQuest, that illuminates the connected components in a molecular network even if they do not contain known PNPs and only contain their variants. VarQuest reveals an order of magnitude more PNP variants than all previous PNP discovery efforts and demonstrates that GNPS already contains spectra from 41% of the currently known PNP families. The enormous diversity of PNPs suggests that biosynthetic gene clusters in various microorganisms constantly evolve to generate a unique spectrum of PNP variants that differ from PNPs in other species. VarQuest—a method to search for new peptidic natural products (PNPs) based on existing mass spectra and chemical structure databases that incorporates potential modifications of known PNPs—identifies an order of magnitude more compounds than previous strategies.

Journal Article

NPvis: An Interactive Visualizer of Peptidic Natural Product–MS/MS Matches

by Kunyavskaya, Olga , Gurevich, Alexey , Mikheenko, Alla in Amino acids , Analysis , Annotations

2022

Peptidic natural products (PNPs) represent a medically important class of secondary metabolites that includes antibiotics, anti-inflammatory and antitumor agents. Advances in tandem mass spectra (MS/MS) acquisition and in silico database search methods have enabled high-throughput PNP discovery. However, the resulting spectra annotations are often error-prone and their validation remains a bottleneck. Here, we present NPvis, a visualizer suitable for the evaluation of PNP–MS/MS matches. The tool interactively maps annotated spectrum peaks to the corresponding PNP fragments and allows researchers to assess the match correctness. NPvis accounts for the wide chemical diversity of PNPs that prevents the use of the existing proteomics visualizers. Moreover, NPvis works even if the exact chemical structure of the matching PNP is unknown. The tool is available online and as a standalone application. We hope that it will benefit the community by streamlining PNP data analysis and validation.

Journal Article

Accurate isoform discovery with IsoQuant using long reads

by Joglekar, Anoushka , Mikheenko, Alla , Tilgner, Hagen U. in 631/114/2785 , 631/114/794 , Agriculture

2023

Annotating newly sequenced genomes and determining alternative isoforms from long-read RNA data are complex and incompletely solved problems. Here we present IsoQuant—a computational tool using intron graphs that accurately reconstructs transcripts both with and without reference genome annotation. For novel transcript discovery, IsoQuant reduces the false-positive rate fivefold and 2.5-fold for Oxford Nanopore reference-based or reference-free mode, respectively. IsoQuant also improves performance for Pacific Biosciences data. IsoQuant predicts novel isoforms from long-read RNA sequencing.

Journal Article

Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies

by Sović, Ivan , Koren, Sergey , Wood, Jonathan M. D. in 631/114/2785 , 631/1647/794 , 631/208/212/2302

2022

Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9 measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies. The work describes the validation and polishing strategies developed by the telomere-to-telomere consortium for evaluating and improving the first complete human genome assembly.

Journal Article
