82 result(s) for "van der Hooft Justin J J"
Propagating annotations of molecular networks using in silico fragmentation
The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to an MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.
MS2DeepScore: a novel deep learning similarity measure to compare tandem mass spectra
Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are generally considered to be characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural similarity but this approach is strongly limited by a generally poor correlation between both metrics. Here, we propose MS2DeepScore: a novel Siamese neural network to predict the structural similarity between two chemical structures solely based on their MS/MS fragmentation spectra. Using a cleaned dataset of > 100,000 mass spectra of about 15,000 unique known compounds, we trained MS2DeepScore to predict structural similarity scores for spectrum pairs with high accuracy. In addition, sampling different model varieties through Monte-Carlo Dropout is used to further improve the predictions and assess the model’s prediction uncertainty. On 3600 spectra of 500 unseen compounds, MS2DeepScore is able to identify highly-reliable structural matches and to predict Tanimoto scores for pairs of molecules based on their fragment spectra with a root mean squared error of about 0.15. Furthermore, the prediction uncertainty estimate can be used to select a subset of predictions with a root mean squared error of about 0.1. Furthermore, we demonstrate that MS2DeepScore outperforms classical spectral similarity measures in retrieving chemically related compound pairs from large mass spectral datasets, thereby illustrating its potential for spectral library matching. Finally, MS2DeepScore can also be used to create chemically meaningful mass spectral embeddings that could be used to cluster large numbers of spectra. 
Added to the recently introduced unsupervised Spec2Vec metric, we believe that machine learning-supported mass spectral similarity measures have great potential for a range of metabolomics data processing pipelines.
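The two quantities the abstract above reports, Tanimoto scores between molecules and the root mean squared error of predicted scores, can be illustrated with a minimal Python sketch. It is not the MS2DeepScore implementation: fingerprints are assumed here to be plain sets of "on" bit indices, and the function names and toy values are hypothetical.

```python
import math
from typing import Sequence, Set


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen for this sketch: two empty fingerprints score 0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rmse(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Root mean squared error between predicted and reference scores."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


# Toy fingerprints (hypothetical): 3 shared bits, 6 bits in the union.
reference_score = tanimoto({1, 4, 7, 9}, {1, 4, 8, 9, 12})

# Hypothetical model predictions compared against reference Tanimoto scores.
error = rmse([0.45, 0.80, 0.10], [reference_score, 0.90, 0.25])
```

In this reading, an error of about 0.15 means that predicted scores typically deviate from the fingerprint-based Tanimoto score by roughly that amount on the 0 to 1 scale.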
Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions
Specialised metabolites from microbial sources are well-known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites represents a promising way of finding such novel chemistry. However, due to the lack of detailed biosynthetic knowledge for the majority of predicted BGCs, and the large number of possible combinations, this is not a simple task. This problem is becoming ever more pressing with the increased availability of paired omics data sets. Current tools are not effective at identifying valid links automatically, and manual verification is a considerable bottleneck in natural product research. We demonstrate that using multiple link-scoring functions together makes it easier to prioritise true links relative to others. Based on standardising a commonly used score, we introduce a new, more effective score, and introduce a novel score using an Input-Output Kernel Regression approach. Finally, we present NPLinker, a software framework to link genomic and metabolomic data. Results are verified using publicly available data sets that include validated links.
Spec2Vec: Improved mass spectral similarity scoring through learning of structural relationships
Spectral similarity is used as a proxy for structural similarity in many tandem mass spectrometry (MS/MS) based metabolomics analyses such as library matching and molecular networking. Although weaknesses in the relationship between spectral similarity scores and the true structural similarities have been described, little development of alternative scores has been undertaken. Here, we introduce Spec2Vec, a novel spectral similarity score inspired by a natural language processing algorithm—Word2Vec. Spec2Vec learns fragmental relationships within a large set of spectral data to derive abstract spectral embeddings that can be used to assess spectral similarities. Using data derived from GNPS MS/MS libraries including spectra for nearly 13,000 unique molecules, we show how Spec2Vec scores correlate better with structural similarity than cosine-based scores. We demonstrate the advantages of Spec2Vec in library matching and molecular networking. Spec2Vec is computationally more scalable allowing structural analogue searches in large databases within seconds.
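The idea of comparing spectra through learned embeddings, as described in the abstract above, can be sketched in Python as follows. This is an illustration under assumptions, not the Spec2Vec package API: the per-peak vectors are hand-written toy values standing in for a trained Word2Vec model, and the "peak@m/z" word format, the intensity weighting, and all function names are assumed for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, relative intensity)


def peak_to_word(mz: float, decimals: int = 2) -> str:
    """Turn a peak into a 'word' by rounding its m/z (assumed convention)."""
    return f"peak@{mz:.{decimals}f}"


def embed(spectrum: Sequence[Peak], word_vectors: Dict[str, List[float]]) -> List[float]:
    """Intensity-weighted sum of the vectors of all peaks known to the model."""
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    for mz, intensity in spectrum:
        vector = word_vectors.get(peak_to_word(mz))
        if vector is None:
            continue  # peak not seen during training: skipped in this sketch
        for i, value in enumerate(vector):
            total[i] += intensity * value
    return total


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / norm


# Hypothetical 2-dimensional vectors; a real model would learn these from data.
toy_vectors = {
    "peak@105.07": [1.0, 0.0],
    "peak@77.04": [0.0, 1.0],
    "peak@91.05": [1.0, 1.0],
}
spectrum_a = [(105.07, 1.0), (77.04, 1.0)]
spectrum_b = [(91.05, 0.5)]
similarity = cosine(embed(spectrum_a, toy_vectors), embed(spectrum_b, toy_vectors))
```

The two toy spectra share no peaks, so a peak-matching cosine score would be zero, yet their embeddings point in the same direction. That is the kind of learned fragment relationship the abstract refers to.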
iPRESTO: Automated discovery of biosynthetic sub-clusters linked to specific natural product substructures
Microbial specialised metabolism is full of valuable natural products that are applied clinically, agriculturally, and industrially. The genes that encode their biosynthesis are often physically clustered on the genome in biosynthetic gene clusters (BGCs). Many BGCs consist of multiple groups of co-evolving genes called sub-clusters that are responsible for the biosynthesis of a specific chemical moiety in a natural product. Sub-clusters therefore provide an important link between the structures of a natural product and its BGC, which can be leveraged for predicting natural product structures from sequence, as well as for linking chemical structures and metabolomics-derived mass features to BGCs. While some initial computational methodologies have been devised for sub-cluster detection, current approaches are not scalable, have only been run on small and outdated datasets, or produce an impractically large number of possible sub-clusters to mine through. Here, we constructed a scalable method for unsupervised sub-cluster detection, called iPRESTO, based on topic modelling and statistical analysis of co-occurrence patterns of enzyme-coding protein families. iPRESTO was used to mine sub-clusters across 150,000 prokaryotic BGCs from antiSMASH-DB. After annotating a fraction of the resulting sub-cluster families, we could predict a substructure for 16% of the antiSMASH-DB BGCs. Additionally, our method was able to confirm 83% of the experimentally characterised sub-clusters in MIBiG reference BGCs. Based on iPRESTO-detected sub-clusters, we could correctly identify the BGCs for xenorhabdin and salbostatin biosynthesis (which had not yet been annotated in BGC databases), as well as propose a candidate BGC for akashin biosynthesis. Additionally, we show for a collection of 145 actinobacteria how substructures can aid in linking BGCs to molecules by correlating iPRESTO-detected sub-clusters to MS/MS-derived Mass2Motifs substructure patterns. 
This work paves the way for deeper functional and structural annotation of microbial BGCs by improved linking of orphan molecules to their cognate gene clusters, thus facilitating accelerated natural product discovery.
Veterinary trypanocidal benzoxaboroles are peptidase-activated prodrugs
Livestock diseases caused by Trypanosoma congolense, T. vivax and T. brucei, collectively known as nagana, are responsible for billions of dollars in lost food production annually. There is an urgent need for novel therapeutics. Encouragingly, promising antitrypanosomal benzoxaboroles are under veterinary development. Here, we show that the most efficacious subclass of these compounds are prodrugs activated by trypanosome serine carboxypeptidases (CBPs). Drug-resistance to a development candidate, AN11736, emerged readily in T. brucei, due to partial deletion within the locus containing three tandem copies of the CBP genes. T. congolense parasites, which possess a larger array of related CBPs, also developed resistance to AN11736 through deletion within the locus. A genome-scale screen in T. brucei confirmed CBP loss-of-function as the primary mechanism of resistance and CRISPR-Cas9 editing proved that partial deletion within the locus was sufficient to confer resistance. CBP re-expression in either T. brucei or T. congolense AN11736-resistant lines restored drug-susceptibility. CBPs act by cleaving the benzoxaborole AN11736 to a carboxylic acid derivative, revealing a prodrug activation mechanism. Loss of CBP activity results in massive reduction in net uptake of AN11736, indicating that entry is facilitated by the concentration gradient created by prodrug metabolism.
MS2Query: reliable and scalable MS2 mass spectra-based analogue search
Metabolomics-driven discoveries of biological samples remain hampered by the grand challenge of metabolite annotation and identification. Only a few metabolites have an annotated spectrum in spectral libraries; hence, searching only for exact library matches generally returns few hits. An attractive alternative is searching for so-called analogues as a starting point for structural annotations; analogues are library molecules which are not exact matches but display a high chemical similarity. However, current analogue search implementations are not yet very reliable and are relatively slow. Here, we present MS2Query, a machine learning-based tool that integrates mass spectral embedding-based chemical similarity predictors (Spec2Vec and MS2Deepscore) as well as detected precursor masses to rank potential analogues and exact matches. Benchmarking MS2Query on reference mass spectra and experimental case studies demonstrates improved reliability and scalability. Thereby, MS2Query offers exciting opportunities to further increase the annotation rate of metabolomics profiles of complex metabolite mixtures and to discover new biology. The authors develop a machine learning approach to find structurally related chemicals in mass spectral libraries. Their method boosts the annotation rate and aids in assessing novelty in metabolomics datasets.
In Silico Optimization of Mass Spectrometry Fragmentation Strategies in Metabolomics
Liquid chromatography (LC) coupled to tandem mass spectrometry (MS/MS) is widely used in identifying small molecules in untargeted metabolomics. Various strategies exist to acquire MS/MS fragmentation spectra; however, the development of new acquisition strategies is hampered by the lack of simulators that let researchers prototype, compare, and optimize strategies before validations on real machines. We introduce Virtual Metabolomics Mass Spectrometer (ViMMS), a metabolomics LC-MS/MS simulator framework that allows for scan-level control of the MS2 acquisition process in silico. ViMMS can generate new LC-MS/MS data based on empirical data or virtually re-run a previous LC-MS/MS analysis using pre-existing data to allow the testing of different fragmentation strategies. To demonstrate its utility, we show how ViMMS can be used to optimize N for Top-N data-dependent acquisition (DDA), giving results comparable to modifying N on the mass spectrometer. We expect that ViMMS will save method development time by allowing for offline evaluation of novel fragmentation strategies and optimization of the fragmentation strategy for a particular experiment.
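The Top-N selection step mentioned in the abstract above can be sketched in a few lines of Python. This is a generic illustration of Top-N data-dependent acquisition, not the ViMMS API: the function name, the simple m/z-only exclusion list, and the default tolerance are all assumptions made for the example.

```python
from typing import Iterable, List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity) observed in one MS1 survey scan


def select_top_n(
    ms1_peaks: Sequence[Peak],
    n: int,
    min_intensity: float = 0.0,
    excluded_mz: Iterable[float] = (),
    mz_tol: float = 0.01,
) -> List[float]:
    """Return the m/z values of up to n precursors to fragment, most intense first.

    Peaks below min_intensity, or within mz_tol of an m/z on the dynamic
    exclusion list, are skipped.
    """
    excluded = list(excluded_mz)
    candidates = [
        (mz, intensity)
        for mz, intensity in ms1_peaks
        if intensity >= min_intensity
        and not any(abs(mz - ex) <= mz_tol for ex in excluded)
    ]
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in candidates[:n]]


# Hypothetical survey scan with four peaks.
scan = [(100.0, 5e4), (200.0, 9e4), (300.0, 1e3), (400.0, 7e4)]
first_cycle = select_top_n(scan, n=2)
# In the next cycle, the precursors that were just fragmented are excluded.
second_cycle = select_top_n(scan, n=2, min_intensity=5e3, excluded_mz=first_cycle)
```

Raising N fragments more precursors per cycle but leaves fewer MS1 survey scans per unit time, which is the trade-off a simulator lets researchers explore offline.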
Bridging Ethnobotanical Knowledge and Multi-Omics Approaches for Plant-Derived Natural Product Discovery
For centuries, plant-derived natural products (NPs) have been fundamental to traditional medicine, providing essential therapeutic compounds. Ethnobotanical knowledge has historically guided NP discovery, leading to the identification of key pharmaceuticals such as aspirin, morphine, and artemisinin. However, conventional bioactivity-guided fractionation methods for NP isolation are labour-intensive and can result in the loss of bioactive properties due to the focus on a single compound. Advances in omics sciences—genomics, transcriptomics, proteomics, metabolomics, and phenomics—coupled with computational tools have altogether revolutionised NP research by enabling high-throughput screening and more precise compound identification. This review explores how integrating traditional medicinal knowledge with multi-omics strategies enhances NP discovery. We highlight emerging bioinformatics tools, mass spectrometry techniques, and metabologenomics approaches that accelerate the identification, annotation, and functional characterisation of plant-derived metabolites. Additionally, we discuss challenges in omics data integration and propose strategies to harness ethnobotanical knowledge for targeted NP discovery and drug development. By combining traditional wisdom with modern scientific advancements, this integrated approach paves the way for novel therapeutic discoveries and the sustainable utilisation of medicinal plants.
Combining Feature-Based Molecular Networking and Contextual Mass Spectral Libraries to Decipher Nutrimetabolomics Profiles
Untargeted metabolomics approaches deal with complex data, which hinders obtaining the structural information needed for the comprehensive analysis of unknown metabolite features. We investigated the metabolite discovery capacity and the possible extension of the annotation coverage of the Feature-Based Molecular Networking (FBMN) approach by adding two novel nutritionally-relevant (contextual) mass spectral libraries to the existing public ones, as compared to widely-used open-source annotation protocols. Two contextual mass spectral libraries in positive and negative ionization mode of ~300 reference molecules relevant for plant-based nutrikinetic studies were created and made publicly available through the GNPS platform. The postprandial urinary metabolome analysis within the intervention of Vaccinium supplements was selected as a case study. Following the FBMN approach in combination with the added contextual mass spectral libraries, 67 berry-related and human endogenous metabolites were annotated, achieving a structural annotation coverage comparable to or higher than existing non-commercial annotation workflows. To further exploit the quantitative data obtained within the FBMN environment, the postprandial behavior of the annotated metabolites was analyzed with Pearson product-moment correlation. This simple chemometric tool linked several molecular families with phase II and phase I metabolism. The proposed approach is a powerful strategy to employ in longitudinal studies since it reduces the unknown chemical space by boosting the annotation power to characterize biochemically relevant metabolites in human biofluids.
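The Pearson product-moment correlation used in the abstract above to compare postprandial profiles can be written out as a short Python sketch. The metabolite intensities below are invented for illustration and are not data from the study; the function name is likewise hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical intensities of two annotated metabolites at five time points
# after intake; similar time courses give a correlation close to +1.
metabolite_a = [1.0, 3.0, 6.0, 4.0, 2.0]
metabolite_b = [2.0, 6.5, 11.0, 8.0, 4.5]
r = pearson(metabolite_a, metabolite_b)
```

Metabolites whose excretion rises and falls together score near +1, which is how such a correlation can group molecular families with a shared metabolic origin.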