Catalogue Search | MBRL

Bioinformatics Methods for Mass Spectrometry-Based Proteomics Data Analysis

by Chen, Chen , Cheng, Jianlin , Hou, Jie in Accuracy , Algorithms , Bioinformatics

2020

Recent advances in mass spectrometry (MS)-based proteomics have enabled tremendous progress in the understanding of cellular mechanisms, disease progression, and the relationship between genotype and phenotype. Though many popular bioinformatics methods in proteomics are derived from other omics studies, novel analysis strategies are required to deal with the unique characteristics of proteomics data. In this review, we discuss the current developments in the bioinformatics methods used in proteomics and how they facilitate the mechanistic understanding of biological processes. We first introduce bioinformatics software and tools designed for mass spectrometry-based protein identification and quantification, and then we review the different statistical and machine learning methods that have been developed to perform comprehensive analysis in proteomics studies. We conclude with a discussion of how quantitative protein data can be used to reconstruct protein interactions and signaling networks.

Journal Article


mspms: an R package and GUI for multiplex substrate profiling by mass spectrometry

by Gonzalez, David J. , O’Donoghue, Anthony , Bayne, Charlie in Accessibility , Algorithms , Bioinformatics

2026

Background: Multiplex Substrate Profiling by Mass Spectrometry (MSP-MS) is a powerful method for determining the substrate specificity of proteolytic enzymes, which is essential for developing protease inhibitors, diagnostics, and protease-activated therapeutics. However, the complex datasets generated by MSP-MS pose significant analytical challenges and have limited accessibility for non-specialist users. Results: We developed mspms, a Bioconductor R package with an accompanying graphical interface, to streamline the analysis of MSP-MS data. mspms standardizes workflows for data preparation, processing, statistical analysis, and visualization. The tool is designed for accessibility, serving advanced users through the R package and broader audiences through a web-based interface. We validated mspms using data from four well-characterized cathepsins (A–D), demonstrating that it reliably captures expected substrate specificities. Conclusions: mspms is the first publicly available, comprehensive platform for MSP-MS data analysis downstream of peptide identification and quantification. It integrates preprocessing, normalization, statistical testing, and visualization into a single, transparent, and user-friendly framework, making it a valuable resource for the protease research community. The package is distributed via Bioconductor, and a graphical interface is available online for interactive use.

Journal Article


Target-Decoy Approach and False Discovery Rate: When Things May Go Wrong

by Keich, Uri , Gupta, Nitin , Pevzner, Pavel A. in Algorithms , Analytical Chemistry , Bioinformatics

2011

The target-decoy approach (TDA) has done the field of proteomics a great service by filling in the need to estimate the false discovery rates (FDR) of peptide identifications. While TDA is often viewed as a universal solution to the problem of FDR evaluation, we argue that the time has come to critically re-examine TDA and to acknowledge not only its merits but also its demerits. We demonstrate that some popular MS/MS search tools are not TDA-compliant and that it is easy to develop a non-TDA compliant tool that outperforms all TDA-compliant tools. Since the distinction between TDA-compliant and non-TDA compliant tools remains elusive, we are concerned about a possible proliferation of non-TDA-compliant tools in the future (developed with the best intentions). We are also concerned that estimation of the FDR by TDA awkwardly depends on a virtual coin toss and argue that it is important to take the coin toss factor out of our estimation of the FDR. Since computing FDR via TDA suffers from various restrictions, we argue that TDA is not needed when accurate p-values of individual Peptide-Spectrum Matches are available.
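As a point of reference for the abstract above, the basic target-decoy estimate can be sketched in a few lines of Python. This is a minimal illustration of the commonly used decoys-over-targets ratio for a concatenated target/decoy search, not the analysis performed by the authors. The function name `estimate_fdr`, the `(score, is_decoy)` input layout, and the convention that higher scores are better are all assumptions made for this example.

```python
def estimate_fdr(psms, threshold):
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    `psms` is an iterable of (score, is_decoy) pairs (hypothetical layout),
    where a higher score means a better peptide-spectrum match.
    The estimate is (#decoys) / (#targets) above the threshold,
    capped at 1.0; it returns 0.0 when no target passes.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score >= threshold:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
    if targets == 0:
        return 0.0
    return min(1.0, decoys / targets)


# Toy data: six PSMs, two of which matched decoy sequences.
example_psms = [
    (10.0, False), (9.0, False), (8.0, True),
    (7.0, False), (6.0, False), (5.0, True),
]
fdr_at_6 = estimate_fdr(example_psms, threshold=6.0)
```

Because the decoy count depends on which random decoy database was generated, repeated runs with different decoys can give different estimates at the same threshold, which is the "coin toss" dependence the abstract objects to.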

Journal Article


A survey of current trends in computational predictions of protein-protein interactions

by WANG, Yanbin , LI, Liping , CHEN, Zhanheng in computational proteomics , Computer Science , Genomes

2020

Proteomics has become an important research area in the life sciences since the completion of the Human Genome Project. The field studies the characteristics of proteins at large scale in order to gain a holistic and comprehensive understanding of disease occurrence and cell metabolism at the protein level. A key issue in proteomics is how to efficiently analyze the massive amounts of protein data produced by high-throughput technologies. Low-cost, short-cycle computational technologies are becoming the preferred methods for solving some important problems in the post-genome era, such as the detection of protein-protein interactions (PPIs). In this review, we focus on computational methods for PPI detection and show recent advancements in this critical area from multiple aspects. First, we analyze in detail several challenges facing computational methods for predicting PPIs and summarize the available PPI data sources. Second, we describe the state-of-the-art computational methods recently proposed on this topic. Finally, we discuss some important technologies that can promote the prediction of PPIs and the development of computational proteomics.

Journal Article


Geena 2, improved automated analysis of MALDI/TOF mass spectra

by Rocco, Mattia , Ferri, Fabio , Romano, Paolo in Algorithms , Analysis , Automation

2016

Background: Mass spectrometry (MS) is producing high volumes of data supporting oncological sciences, especially for translational research. Most of the related data processing can be carried out by combining existing tools at different levels, but little is currently available for the automation of the fundamental steps. For the analysis of MALDI/TOF spectra, a number of pre-processing steps are required, including joining of isotopic abundances for a given molecular species, normalization of signals against an internal standard, background noise removal, averaging multiple spectra from the same sample, and aligning spectra from different samples. In this paper, we present Geena 2, a public software tool for the automated execution of these pre-processing steps for MALDI/TOF spectra. Results: Geena 2 has been developed in a Linux-Apache-MySQL-PHP web development environment, with scripts in PHP and Perl. Input and output are managed as simple formats that can be consumed by any database system and spreadsheet software. Input data may also be stored in a MySQL database. Processing methods are based on original heuristic algorithms which are introduced in the paper. Three simple and intuitive web interfaces are available: the Standard Search Interface, which allows complete control over all parameters; the Bright Search Interface, which lets the user tune the parameters for alignment of spectra; and the Quick Search Interface, which limits the number of parameters to a minimum by using default values for the majority of parameters. Geena 2 has been utilized, in conjunction with a statistical analysis tool, in three published experimental works: a proteomic study on the effects of long-term cryopreservation on the low molecular weight fraction of serum proteome, and two retrospective serum proteomic studies, one on the risk of developing breast cancer in patients affected by gross cystic disease of the breast (GCDB) and the other for the identification of a predictor of breast cancer mortality following breast cancer surgery, whose results were validated by ELISA, a completely alternative method. Conclusions: Geena 2 is a public tool for the automated pre-processing of MS data generated by MALDI/TOF instruments, with a simple and intuitive web interface. It is now under active development for the inclusion of further filtering options and for the adoption of standard formats for MS spectra.
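Two of the pre-processing steps named in the abstract above, normalization against an internal standard and averaging of replicate spectra, can be illustrated generically as follows. This is only a sketch under stated assumptions and does not reproduce Geena 2's heuristic algorithms, which are described in the paper itself. The function names, the `(mz, intensity)` list layout, the matching tolerance, and the requirement that replicates share the same m/z axis are all hypothetical choices for this example.

```python
def normalize_to_standard(spectrum, standard_mz, tol=0.5):
    """Scale intensities so the internal standard peak has intensity 1.0.

    `spectrum` is a list of (mz, intensity) pairs (assumed layout).
    The standard peak is taken as the most intense peak within `tol`
    of `standard_mz`; a ValueError is raised if none is found.
    """
    ref = max(
        (inten for mz, inten in spectrum if abs(mz - standard_mz) <= tol),
        default=0.0,
    )
    if ref <= 0:
        raise ValueError("internal standard peak not found")
    return [(mz, inten / ref) for mz, inten in spectrum]


def average_spectra(spectra):
    """Average replicate spectra point by point.

    Assumes every spectrum has the same m/z values in the same order,
    i.e. that alignment has already been done.
    """
    n = len(spectra)
    return [
        (points[0][0], sum(inten for _, inten in points) / n)
        for points in zip(*spectra)
    ]


# Two toy replicates with the internal standard at m/z 1500.
rep_a = [(1000.0, 50.0), (1500.0, 200.0)]
rep_b = [(1000.0, 30.0), (1500.0, 100.0)]
averaged = average_spectra(
    [normalize_to_standard(s, standard_mz=1500.0) for s in (rep_a, rep_b)]
)
```

Normalizing before averaging keeps a replicate with a stronger overall signal from dominating the mean.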

Journal Article


Detecting and Removing Data Artifacts in Hadamard Transform Ion Mobility-Mass Spectrometry Measurements

by Ibrahim, Yehia M. , Monroe, Matthew E. , Prost, Spencer A. in Algorithms , Analytical Chemistry , BASIC BIOLOGICAL SCIENCES

2014

Applying Hadamard transform multiplexing to ion mobility separations (IMS) can significantly improve the signal-to-noise ratio and throughput for IMS coupled mass spectrometry (MS) measurements by increasing the ion utilization efficiency. However, it has been determined that fluctuations in ion intensity as well as spatial shifts in the multiplexed data lower the signal-to-noise ratios and appear as noise in downstream processing of the data. To address this problem, we have developed a novel algorithm that discovers and eliminates data artifacts. The algorithm employs an analytical approach to identify and remove artifacts from the data, decreasing the likelihood of false identifications in subsequent data processing. Following application of the algorithm, IMS-MS measurement sensitivity is greatly increased and artifacts that previously limited the utility of applying the Hadamard transform to IMS are avoided.

Journal Article


Interactive exploration of ligand transportation through protein tunnels

by Furmanová, Katarína , Byška, Jan , Jurčík, Adam in Active sites (Biochemistry) , Algorithms , Amino acids

2017

Background: Protein structures and their interaction with ligands have been a focus of biochemistry and structural biology research for decades. The transportation of a ligand into the protein active site is often a complex process, driven by geometric and physico-chemical properties, which renders the ligand path full of jitter and impasses. This hinders understanding of the ligand transportation and of the reasons behind the ligand's behavior along the path. Results: To address the needs of the domain experts we design an explorative visualization solution based on a multi-scale simplification model. It helps guide the user to the most interesting parts of the ligand trajectory by exploring different attributes of the ligand and its movement, such as its distance to the active site, changes of amino acids lining the ligand, or ligand “stuckness”. The process is supported by three linked views: a 3D representation of the simplified trajectory, a scatterplot matrix, and bar charts with a line representation of ligand-lining amino acids. Conclusions: The usage of our tool is demonstrated on molecular dynamics simulations provided by the domain experts. The tool was tested by domain experts from protein engineering, and the results confirm that it helps guide the user to the most interesting parts of the ligand trajectory and to understand the ligand behavior.

Journal Article


Evolutionary Dynamics of Indels in SARS-CoV-2 Spike Glycoprotein

by Su, Lingtao , Fornelli, Luca , Ahsan, Nagib in Amino acids , Coronaviruses , COVID-19

2021

SARS-CoV-2, responsible for the current COVID-19 pandemic that claimed over 5.0 million lives, belongs to a class of enveloped viruses that undergo quick evolutionary adjustments under selection pressure. Numerous variants have emerged in SARS-CoV-2, posing a serious challenge to the global vaccination effort and COVID-19 management. The evolutionary dynamics of this virus are only beginning to be explored. In this work, we have analysed 1.79 million spike glycoprotein sequences of SARS-CoV-2 and found that the virus is fine-tuning the spike with numerous amino acid insertions and deletions (indels). Indels seem to have a selective advantage as the proportions of sequences with indels steadily increased over time, currently at over 89%, with similar trends across countries/variants. There were as many as 420 unique indel positions and 447 unique combinations of indels. Despite their high frequency, indels resulted in only minimal alteration of N-glycosylation sites, including both gain and loss. As indels and point mutations are positively correlated and sequences with indels have significantly more point mutations, they have implications in the evolutionary dynamics of the SARS-CoV-2 spike glycoprotein.

Journal Article


Transitioning from Targeted to Comprehensive Mass Spectrometry Using Genetic Algorithms

by Jaffe, Jacob D. , Feeney, Caitlin M. , Patel, Jinal in Analytical Chemistry , Assaying , Bioinformatics

2016

Targeted proteomic assays are becoming increasingly popular because of their robust quantitative applications enabled by internal standardization, and they can be routinely executed on high performance mass spectrometry instrumentation. However, these assays are typically limited to 100s of analytes per experiment. Considerable time and effort are often expended in obtaining and preparing samples prior to targeted analyses. It would be highly desirable to detect and quantify 1000s of analytes in such samples using comprehensive mass spectrometry techniques (e.g., SWATH and DIA) while retaining a high degree of quantitative rigor for analytes with matched internal standards. Experimentally, it is facile to port a targeted assay to a comprehensive data acquisition technique. However, data analysis challenges arise from this strategy concerning agreement of results from the targeted and comprehensive approaches. Here, we present the use of genetic algorithms to overcome these challenges in order to configure hybrid targeted/comprehensive MS assays. The genetic algorithms are used to select precursor-to-fragment transitions that maximize the agreement in quantification between the targeted and the comprehensive methods. We find that the algorithm we used provided across-the-board improvement in the quantitative agreement between the targeted assay data and the hybrid comprehensive/targeted assay that we developed, as measured by parameters of linear models fitted to the results. We also found that the algorithm could perform at least as well as an independently-trained mass spectrometrist in accomplishing this task. We hope that this approach will be a useful tool in the development of quantitative approaches for comprehensive proteomics techniques.

Journal Article


XLPM: efficient algorithm for the analysis of protein-protein contacts using chemical cross-linking mass spectrometry

by Hall, Roger , Bauer, Michael A , Raney, Kevin D in Algorithms , Amino acids , Bioinformatics

2014

Background: Chemical cross-linking is used for mapping protein-protein contacts and for structural analysis. One of the difficulties in cross-linking studies is the analysis of mass-spectrometry data and the assignment of the site of cross-link incorporation. The difficulties are due to higher charges of fragment ions, and to the overall low abundance of cross-linked species in the background of linear peptides. Cross-linkers that are non-specific at one end, such as photo-inducible diazirines, may complicate the analysis further. In this report, we design and validate a novel cross-linked peptide mapping algorithm (XLPM) and compare it to StavroX, which is currently one of the best algorithms in this class. Results: We have designed a novel cross-link search algorithm, XLPM, and implemented it both as an online tool and as a downloadable archive of scripts. We designed a filter based on the observation that detection of a b-ion implies detection of a complementary y-ion with high probability (b-y filter). We validated the b-y filter on the set of linear peptides from the NIST library, and demonstrate that it is an effective way to find high-quality mass spectra. Next, we generated cross-linked data from an ssDNA binding protein, Rim1, with a specific cross-linker, disuccinimidyl suberate, and a semi-specific cross-linker, NHS-Diazirine, followed by analysis of the cross-linked products by nanoLC-LTQ-Orbitrap mass spectrometry. The cross-linked data were searched by XLPM and StavroX and the performance of the two algorithms was compared. The cross-links were mapped to the X-ray structure of the Rim1 tetramer. Analysis of the mixture of NHS-Diazirine cross-linked 15N- and 14N-labeled Rim1 tetramers yielded 15N-labeled to 14N-labeled cross-linked peptide pairs, corresponding to C-terminus-to-N-terminus cross-linking, demonstrating interaction between two different Rim1 tetramers. Both XLPM and StavroX were successful in identification of this interaction, with XLPM leading to a better annotation of higher-charged fragments. We also put forward a new method of estimating specificity and sensitivity of identification of a cross-linked residue in the case of a non-specific cross-linker. Conclusions: The novel cross-link mapping algorithm, XLPM, considerably improves the speed and accuracy of the analysis compared to other methods. The quality selection filter based on the b-to-y ion ratio proved to be an effective way to select high-quality cross-linked spectra.
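The idea behind the b-y filter mentioned in the abstract above can be illustrated for the simplest case: a linear peptide with singly charged fragments. There, the m/z of a b-ion and of its complementary y-ion sum to the neutral monoisotopic peptide mass plus two proton masses. The sketch below checks for that complementary peak; it is not XLPM's implementation, and the function name, the peak-list layout, and the 0.02 Da tolerance are assumptions made for illustration. Cross-linked peptides and higher charge states would need additional handling.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def has_complementary_y(b_mz, peak_mzs, peptide_mass, tol=0.02):
    """Check whether the y-ion complementary to a given b-ion is present.

    Assumes a linear peptide and singly charged fragment ions, for which
    m/z(b) + m/z(y) = peptide_mass + 2 * PROTON_MASS, where
    `peptide_mass` is the neutral monoisotopic mass of the peptide.
    `peak_mzs` is an iterable of observed fragment m/z values.
    """
    expected_y = peptide_mass + 2 * PROTON_MASS - b_mz
    return any(abs(mz - expected_y) <= tol for mz in peak_mzs)


def by_pair_fraction(b_mzs, peak_mzs, peptide_mass, tol=0.02):
    """Fraction of the given b-ions that have a complementary y-ion peak."""
    b_mzs = list(b_mzs)
    if not b_mzs:
        return 0.0
    hits = sum(
        has_complementary_y(b, peak_mzs, peptide_mass, tol) for b in b_mzs
    )
    return hits / len(b_mzs)
```

A spectrum in which most assigned b-ions have a matching y-ion would score close to 1.0 under this measure, which is one simple way to turn the pairing observation into a quality score.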

Journal Article

