Catalogue Search | MBRL

Mass Recalibration of FT-ICR Mass Spectrometry Imaging Data Using the Average Frequency Shift of Ambient Ions

by Barry, Jeremy A. , Muddiman, David C. , Robichaud, Guillaume in Analytical Chemistry , Animals , Bioinformatics

2013

Achieving and maintaining high mass measurement accuracy (MMA) throughout a mass spectrometry imaging (MSI) experiment is vital to the identification of the observed ions. However, when using FTMS instruments, fluctuations in the total ion abundance at each pixel due to inherent biological variation in the tissue section can introduce space charge effects that systematically shift the observed mass. Herein we apply a recalibration based on the observed cyclotron frequency shift of ions found in the ambient laboratory environment, polydimethylcyclosiloxanes (PDMS). This calibration method is capable of achieving part per billion (ppb) mass accuracy with relatively high precision for an infrared matrix-assisted laser desorption electrospray ionization (IR-MALDESI) MSI dataset. Comparisons with previously published mass calibration approaches are also presented.
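
The idea in this abstract can be illustrated with a minimal sketch. It assumes a simplified single-constant relation m/z = A / f between mass-to-charge ratio and cyclotron frequency, and a space charge effect that lowers every frequency by the same amount. The abstract does not give the authors' calibration equation, so the constant `CAL_A`, the helper `perturb`, and all m/z values below are hypothetical and chosen only for illustration.

```python
from statistics import mean


def mass_error_ppm(measured_mz, theoretical_mz):
    """Signed mass measurement error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def recalibrate(observed_mz, reference_observed_mz, reference_theoretical_mz, cal_a):
    """Shift every peak by the average frequency offset of the reference ions.

    Assumes the simplified relation m/z = cal_a / f, where cal_a is a
    hypothetical calibration constant (not taken from the paper).
    """
    # Frequency offset of each reference ion: theoretical minus observed.
    shifts = [
        cal_a / theo - cal_a / obs
        for obs, theo in zip(reference_observed_mz, reference_theoretical_mz)
    ]
    avg_shift = mean(shifts)
    # Apply the same offset to every observed frequency, then convert back to m/z.
    return [cal_a / (cal_a / mz + avg_shift) for mz in observed_mz]


# Illustrative numbers only (assumptions, not values from the paper).
CAL_A = 1.0e8            # hypothetical calibration constant
SPACE_CHARGE_HZ = -0.5   # hypothetical uniform frequency shift


def perturb(mz):
    """Simulate a measured m/z whose frequency is lowered by space charge."""
    return CAL_A / (CAL_A / mz + SPACE_CHARGE_HZ)


reference_theoretical = [371.1, 445.1]   # placeholder ambient reference ions
reference_observed = [perturb(mz) for mz in reference_theoretical]

analyte_true = 760.5
analyte_observed = perturb(analyte_true)
analyte_corrected = recalibrate(
    [analyte_observed], reference_observed, reference_theoretical, CAL_A
)[0]

print("error before (ppm):", mass_error_ppm(analyte_observed, analyte_true))
print("error after  (ppm):", mass_error_ppm(analyte_corrected, analyte_true))
```

In an imaging dataset, the same correction would be computed separately for each pixel, since the total ion abundance, and with it the frequency shift, varies from pixel to pixel.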

Journal Article

Proof of concept for identifying cystic fibrosis from perspiration samples

by Zare, Richard N. , Milla, Carlos , Alvarez, Daniel in Algorithms , Biological Sciences , Case-Control Studies

2019

The gold standard for cystic fibrosis (CF) diagnosis is the determination of chloride concentration in sweat. Current testing methodology takes up to 3 h to complete and has recognized shortcomings in its diagnostic accuracy. We present an alternative method for the identification of CF by combining desorption electrospray ionization mass spectrometry and a machine-learning algorithm based on gradient boosted decision trees to analyze perspiration samples. This process takes as little as 2 min, and we determined its accuracy to be 98 ± 2% by cross-validation on 277 perspiration samples. With the introduction of statistical bootstrap, our method can provide a confidence estimate of our prediction, which helps diagnosis decision-making. We also identified important peaks by the feature selection algorithm and assigned the chemical structure of the metabolites by high-resolution and/or tandem mass spectrometry. We inspected the correlation between mild and severe CFTR gene mutation types and lipid profiles, suggesting a possible way to realize personalized medicine with this noninvasive, fast, and accurate method.

Journal Article

Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking

by Keyzers, Robert A , Sims, Amy C , Larson, Charles B in 101/58 , 631/114/2398 , 639/638/11/296

2016

GNPS is an open-access community-curated analysis platform for sharing natural product mass spectrometry data that enables continuous, automatic reanalysis of deposited 'living' data sets. The potential of the diverse chemistries present in natural products (NP) for biotechnology and medicine remains untapped because NP databases are not searchable with raw data and the NP community has no way to share data other than in published papers. Although mass spectrometry (MS) techniques are well-suited to high-throughput characterization of NP, there is a pressing need for an infrastructure to enable sharing and curation of data. We present Global Natural Products Social Molecular Networking (GNPS; http://gnps.ucsd.edu ), an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass (MS/MS) spectrometry data. In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations. Data-driven social-networking should facilitate identification of spectra and foster collaborations. We also introduce the concept of 'living data' through continuous reanalysis of deposited data.

Journal Article

Propagating annotations of molecular networks using in silico fragmentation

by van der Hooft, Justin J. J. , Balunas, Marcy J. , Lopes, Norberto Peporine in Animals , Annotations , Ants - microbiology

2018

The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. One of the alternative approaches used to annotate an unknown fragmentation mass spectrum is through the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty around the correct structure among the predicted candidate lists. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to a MS/MS spectrum in spectral libraries. This is accomplished through creating a network consensus of re-ranked structural candidates using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web-platform https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.

Journal Article

TagGraph reveals vast protein modification landscapes from large tandem mass spectrometry datasets

by Devabhaktuni, Arun , Olsson, Niclas , Pearlman, Samuel M. in 631/114/2784 , 631/80/458 , Agriculture

2019

Although mass spectrometry is well suited to identifying thousands of potential protein post-translational modifications (PTMs), it has historically been biased towards just a few. To measure the entire set of PTMs across diverse proteomes, software must overcome the dual challenges of covering enormous search spaces and distinguishing correct from incorrect spectrum interpretations. Here, we describe TagGraph, a computational tool that overcomes both challenges with an unrestricted string-based search method that is as much as 350-fold faster than existing approaches, and a probabilistic validation model that we optimized for PTM assignments. We applied TagGraph to a published human proteomic dataset of 25 million mass spectra and tripled confident spectrum identifications compared to its original analysis. We identified thousands of modification types on almost 1 million sites in the proteome. We show alternative contexts for highly abundant yet understudied PTMs such as proline hydroxylation, and its unexpected association with cancer mutations. By enabling broad characterization of PTMs, TagGraph informs as to how their functions and regulation intersect. A string-based computational tool enables swift, robust identification of post-translational modifications in MS/MS datasets.

Journal Article

Random forest-based imputation outperforms other methods for imputing LC-MS metabolomics data: a comparative study

by Hanhineva, Kati , Kokla, Marietta , Kolehmainen, Marjukka in Algorithms , Bias , Bioinformatics

2019

Background: LC-MS technology makes it possible to measure the relative abundance of numerous molecular features of a sample in a single analysis. However, non-targeted metabolite profiling approaches in particular generate vast arrays of data that are prone to aberrations such as missing values. Whatever the reason for the missing values, a coherent and complete data matrix is always a prerequisite for accurate and reliable statistical analysis. Therefore, there is a need for proper imputation strategies that account for the missingness and reduce the bias in the statistical analysis. Results: Here we present our results after evaluating nine imputation methods at four different percentages of missing values of different origin. The performance of each imputation method was analyzed by Normalized Root Mean Squared Error (NRMSE). We demonstrated that random forest (RF) had the lowest NRMSE in the estimation of missing values for Missing at Random (MAR) and Missing Completely at Random (MCAR). For values absent due to Missing Not at Random (MNAR), the left-truncated data were best imputed with minimum value imputation. We also tested the different imputation methods for datasets containing missing data of various origin, and RF was the most accurate method in all cases. The results were obtained by repeating the evaluation process 100 times using metabolomics datasets in which missing values were introduced to represent absent data of different origin. Conclusion: The type and rate of missingness affect the performance and suitability of imputation methods. The RF-based imputation method performs best in most of the tested scenarios, including combinations of different types and rates of missingness. Therefore, we recommend using random forest-based imputation for imputing missing metabolomics data, especially in situations where the types of missingness are not known in advance.
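
As a rough illustration of the evaluation metric named in this abstract, the sketch below computes NRMSE for two simple imputation strategies. It uses one common definition, the root of the mean squared error divided by the variance of the true values. The abstract does not state which normalisation the authors used, so that choice is an assumption, as are the toy data and the helper names `nrmse` and `impute`.

```python
import math


def nrmse(true_vals, imputed_vals):
    """Normalized root mean squared error over the imputed entries.

    Uses sqrt(mean squared error / variance of the true values); other
    normalisations (range, mean) exist and give different numbers.
    """
    n = len(true_vals)
    if n == 0 or n != len(imputed_vals):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(true_vals) / n
    variance = sum((t - mean_true) ** 2 for t in true_vals) / n
    if variance == 0:
        raise ValueError("true values have zero variance")
    mse = sum((t - i) ** 2 for t, i in zip(true_vals, imputed_vals)) / n
    return math.sqrt(mse / variance)


def impute(values, fill):
    """Replace every None in values with the constant fill."""
    return [fill if v is None else v for v in values]


# Toy feature column (hypothetical intensities) with three values masked out.
complete = [4.0, 7.5, 6.1, 9.3, 5.2, 8.8, 3.9, 7.0]
masked_idx = [1, 4, 6]
with_missing = [None if i in masked_idx else v for i, v in enumerate(complete)]

observed = [v for v in with_missing if v is not None]
mean_filled = impute(with_missing, sum(observed) / len(observed))
min_filled = impute(with_missing, min(observed))

# Score only the positions that were actually imputed.
truth = [complete[i] for i in masked_idx]
print("mean imputation NRMSE:", nrmse(truth, [mean_filled[i] for i in masked_idx]))
print("min imputation NRMSE: ", nrmse(truth, [min_filled[i] for i in masked_idx]))
```

A lower NRMSE means the imputed values sit closer to the values that were masked out, which is how the competing methods would be ranked in a comparison of this kind.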

Journal Article

Normalization and missing value imputation for label-free LC-MS analysis

by Karpievitch, Yuliya V , Smith, Richard D , Dabney, Alan R in Algorithms , Bias , Bioinformatics

2012

Shotgun proteomic data are affected by a variety of known and unknown systematic biases as well as high proportions of missing values. Typically, normalization is performed in an attempt to remove systematic biases from the data before statistical inference, sometimes followed by missing value imputation to obtain a complete matrix of intensities. Here we discuss several approaches to normalization and dealing with missing values, some initially developed for microarray data and some developed specifically for mass spectrometry-based data.

Journal Article

Cumulative learning enables convolutional neural network representations for small mass spectrometry data classification

by Fournier, Isabelle , Saudemont, Philippe , Wisztorski, Maxence in 101/58 , 631/114/1305 , 631/114/1314

2020

Rapid and accurate clinical diagnosis remains challenging. A component of diagnosis tool development is the design of effective classification models with Mass spectrometry (MS) data. Some Machine Learning approaches have been investigated but these models require time-consuming preprocessing steps to remove artifacts, making them unsuitable for rapid analysis. Convolutional Neural Networks (CNNs) have been found to perform well under such circumstances since they can learn representations from raw data. However, their effectiveness decreases when the number of available training samples is small, which is a common situation in medicine. In this work, we investigate transfer learning on 1D-CNNs, then we develop a cumulative learning method when transfer learning is not powerful enough. We propose to train the same model through several classification tasks over various small datasets to accumulate knowledge in the resulting representation. By using rat brain as the initial training dataset, a cumulative learning approach can have a classification accuracy exceeding 98% for 1D clinical MS-data. We show the use of cumulative learning using datasets generated in different biological contexts, on different organisms, and acquired by different instruments. Here we show a promising strategy for improving MS data classification accuracy when only small numbers of samples are available. Convolutional Neural Networks are powerful tools for clinical diagnosis but their effectiveness decreases when the number of available samples is small. Here, the authors develop a cumulative learning method by training the same model through several classification tasks over various small Mass Spectrometry datasets.

Journal Article

mProphet: automated data processing and statistical validation for large-scale SRM experiments

by Hengartner, Michael O , Aebersold, Ruedi , Picotti, Paola in 631/1647/527/296 , 631/92/475 , Algorithms

2011

mProphet, a computational tool for statistically validating selected reaction monitoring (SRM) mass spectrometry data, is described. Selected reaction monitoring (SRM) is a targeted mass spectrometric method that is increasingly used in proteomics for the detection and quantification of sets of preselected proteins at high sensitivity, reproducibility and accuracy. Currently, data from SRM measurements are mostly evaluated subjectively by manual inspection on the basis of ad hoc criteria, precluding the consistent analysis of different data sets and an objective assessment of their error rates. Here we present mProphet, a fully automated system that computes accurate error rates for the identification of targeted peptides in SRM data sets and maximizes specificity and sensitivity by combining relevant features in the data into a statistical model.

Journal Article

Quantitative mass spectrometry in proteomics: a critical review

by Bantscheff, Marcus , Rick, Jens , Schirle, Markus in analysis , Automatic Data Processing , chemistry

2007

The quantification of differences between two or more physiological states of a biological system is among the most important but also most challenging technical tasks in proteomics. In addition to the classical methods of differential protein gel or blot staining by dyes and fluorophores, mass-spectrometry-based quantification methods have gained increasing popularity over the past five years. Most of these methods employ differential stable isotope labeling to create a specific mass tag that can be recognized by a mass spectrometer and at the same time provide the basis for quantification. These mass tags can be introduced into proteins or peptides (i) metabolically, (ii) by chemical means, (iii) enzymatically, or (iv) provided by spiked synthetic peptide standards. In contrast, label-free quantification approaches aim to correlate the mass spectrometric signal of intact proteolytic peptides or the number of peptide sequencing events with the relative or absolute protein quantity directly. In this review, we critically examine the more commonly used quantitative mass spectrometry methods for their individual merits and discuss challenges in arriving at meaningful interpretations of quantitative proteomic data.

Journal Article
