Catalogue Search | MBRL

A deep proteome and transcriptome abundance atlas of 29 healthy human tissues

by Meng, Chen; Hahne, Hannes; Eraslan, Basak in Amino acids, Biomarkers, Brain research

2019

Genome‐, transcriptome‐ and proteome‐wide measurements provide insights into how biological systems are regulated. However, fundamental aspects relating to which human proteins exist, where they are expressed and in which quantities are not fully understood. Therefore, we generated a quantitative proteome and transcriptome abundance atlas of 29 paired healthy human tissues from the Human Protein Atlas project representing human genes by 18,072 transcripts and 13,640 proteins including 37 without prior protein‐level evidence. The analysis revealed that hundreds of proteins, particularly in testis, could not be detected even for highly expressed mRNAs, that few proteins show tissue‐specific expression, that strong differences between mRNA and protein quantities within and across tissues exist and that protein expression is often more stable across tissues than that of transcripts. Only 238 of 9,848 amino acid variants found by exome sequencing could be confidently detected at the protein level showing that proteogenomics remains challenging, needs better computational methods and requires rigorous validation. Many uses of this resource can be envisaged including the study of gene/protein expression regulation and biomarker specificity evaluation.

Synopsis: Proteome and transcriptome quantification across tissues reveals which human genes exist as transcripts and proteins, where they are expressed and in which approximate quantities. Tissue‐specific protein expression is found to be a rare and quantitative rather than qualitative characteristic. The study presents the most comprehensive atlas of protein expression to date, across 29 healthy human tissues. Protein level evidence is provided for 13,640 genes and 15,257 isoforms, including 37 missing proteins. Proteogenomics is still challenging and needs rigorous validation by synthetic peptides.

Journal Article

Generating high quality libraries for DIA MS with empirically corrected peptide predictions

by Wilhelm, Mathias; Swearingen, Kristian E.; Küster, Bernhard in 631/114/2784, 631/1647/334/2246, 631/45/612/1248

2020

Data-independent acquisition approaches typically rely on experiment-specific spectrum libraries, requiring offline fractionation and tens to hundreds of injections. We demonstrate a library generation workflow that leverages fragmentation and retention time prediction to build libraries containing every peptide in a proteome, and then refines those libraries with empirical data. Our method specifically enables rapid, experiment-specific library generation for non-model organisms, which we demonstrate using the malaria parasite Plasmodium falciparum, and non-canonical databases, which we show by detecting missense variants in HeLa.

Data-independent acquisition-mass spectrometry (MS) typically requires many preparatory MS runs to produce experiment-specific spectral libraries. Here, the authors show that empirical correction of in silico predicted spectral libraries enables efficient generation of high-quality experiment-specific libraries.

Journal Article

Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning

by Ehrlich Hans-Christian; Schnatbaum Karsten; Gessulat Siegfried in Artificial neural networks, Database searching, Deep learning

2019

In mass-spectrometry-based proteomics, the identification and quantification of peptides and proteins heavily rely on sequence database searching or spectral library matching. The lack of accurate predictive models for fragment ion intensities impairs the realization of the full potential of these approaches. Here, we extended the ProteomeTools synthetic peptide library to 550,000 tryptic peptides and 21 million high-quality tandem mass spectra. We trained a deep neural network, termed Prosit, resulting in chromatographic retention time and fragment ion intensity predictions that exceed the quality of the experimental data. Integrating Prosit into database search pipelines led to more identifications at >10× lower false discovery rates. We show the general applicability of Prosit by predicting spectra for proteases other than trypsin, generating spectral libraries for data-independent acquisition and improving the analysis of metaproteomes. Prosit is integrated into ProteomicsDB, allowing search result re-scoring and custom spectral library generation for any organism on the basis of peptide sequence alone.

A deep learning–based tool, Prosit, predicts high-quality peptide tandem mass spectra, improving peptide-identification performance compared with that of traditional proteomics analysis methods.

Journal Article

Exploring crop genomes: assembly features, gene prediction accuracy, and implications for proteomics studies

by Frishman, Dmitrij; Wilhelm, Mathias; Abbas, Qussai in Accuracy, Analysis, Animal behavior

2024

Plant genomics plays a pivotal role in enhancing global food security and sustainability by offering innovative solutions for improving crop yield, disease resistance, and stress tolerance. As the number of sequenced genomes grows and the accuracy and contiguity of genome assemblies improve, structural annotation of plant genomes continues to be a significant challenge due to their large size, polyploidy, and rich repeat content. In this paper, we present an overview of the current landscape in crop genomics research, highlighting the diversity of genomic characteristics across various crop species. We also assessed the accuracy of popular gene prediction tools in identifying genes within crop genomes and examined the factors that impact their performance. Our findings highlight the strengths and limitations of BRAKER2 and Helixer as leading structural genome annotation tools and underscore the impact of genome complexity, fragmentation, and repeat content on their performance. Furthermore, we evaluated the suitability of the predicted proteins as a reliable search space in proteomics studies using mass spectrometry data. Our results provide valuable insights for future efforts to refine and advance the field of structural genome annotation.

Journal Article

PeriSense: Ring-Based Multi-Finger Gesture Interaction Utilizing Capacitive Proximity Sensing

by Wilhelm, Mathias; Krakowczyk, Daniel; Albayrak, Sahin in Adult, Cameras, capacitive sensing

2020

Rings are widely accepted wearables for gesture interaction. However, most rings can sense only the motion of one finger or the whole hand. We present PeriSense, a ring-shaped interaction device enabling multi-finger gesture interaction. Gestures of the finger wearing the ring and of its adjacent fingers are sensed by measuring capacitive proximity between electrodes and human skin. Our main contribution is the determination of PeriSense’s interaction space, involving the evaluation of its capabilities and limitations. We introduce a prototype named PeriSense, analyze the sensor resolution at different distances, and evaluate finger gestures and unistroke gestures based on gesture sets, allowing the determination of the strengths and limitations. We show that PeriSense is able to sense the change of conductive objects reliably up to 2.5 cm. Furthermore, we show that this capability enables different interaction techniques such as multi-finger gesture recognition or two-handed unistroke input.

Journal Article

Mass-spectrometry-based draft of the Arabidopsis proteome

by Dunkel, Andreas; Sprunck, Stefanie; List, Markus in 38/39, 38/91, 631/1647/2067

2020

Plants are essential for life and are extremely diverse organisms with unique molecular capabilities¹. Here we present a quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana. Our analysis provides initial answers to how many genes exist as proteins (more than 18,000), where they are expressed, in which approximate quantities (a dynamic range of more than six orders of magnitude) and to what extent they are phosphorylated (over 43,000 sites). We present examples of how the data may be used, such as to discover proteins that are translated from short open-reading frames, to uncover sequence motifs that are involved in the regulation of protein production, and to identify tissue-specific protein complexes or phosphorylation-mediated signalling events. Interactive access to this resource for the plant community is provided by the ProteomicsDB and ATHENA databases, which include powerful bioinformatics tools to explore and characterize Arabidopsis proteins, their modifications and interactions.

A quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana provides a valuable resource for plant research.

Journal Article

Fragment ion intensity prediction improves the identification rate of non-tryptic peptides in timsTOF

by Wilhelm, Mathias; Bittremieux, Wout; Gabriel, Wassim in 119/118, 49/75, 631/114/1305

2024

Immunopeptidomics is crucial for immunotherapy and vaccine development. Because the generation of immunopeptides from their parent proteins does not adhere to clear-cut rules, rather than being able to use known digestion patterns, every possible protein subsequence within human leukocyte antigen (HLA) class-specific length restrictions needs to be considered during sequence database searching. This leads to an inflation of the search space and results in lower spectrum annotation rates. Peptide-spectrum match (PSM) rescoring is a powerful enhancement of standard searching that boosts the spectrum annotation performance. We analyze 302,105 unique synthesized non-tryptic peptides from the ProteomeTools project on a timsTOF-Pro to generate a ground-truth dataset containing 93,227 MS/MS spectra of 74,847 unique peptides, which is used to fine-tune the deep learning-based fragment ion intensity prediction model Prosit. We demonstrate up to 3-fold improvement in the identification of immunopeptides, as well as increased detection of immunopeptides from low input samples.

Immunopeptidomics is crucial for the discovery of potential immunotherapy and vaccine candidates. Here, the authors generate a ground truth timsTOF dataset to fine-tune the deep learning model Prosit, improving peptide-spectrum match rescoring by up to 3-fold during immunopeptide identification.

Journal Article

Mass-spectrometry-based draft of the human proteome

by Hahne, Hannes; Wenschuh, Holger; Gerstmair, Anja in 631/45/475, Analysis, Body Fluids - chemistry

2014

Proteomes are characterized by large protein-abundance differences, cell-type- and time-dependent expression patterns and post-translational modifications, all of which carry biological information that is not accessible by genomics or transcriptomics. Here we present a mass-spectrometry-based draft of the human proteome and a public, high-performance, in-memory database for real-time analysis of terabytes of big data, called ProteomicsDB. The information assembled from human tissues, cell lines and body fluids enabled estimation of the size of the protein-coding genome, and identified organ-specific proteins and a large number of translated lincRNAs (long intergenic non-coding RNAs). Analysis of messenger RNA and protein-expression profiles of human tissues revealed conserved control of protein abundance, and integration of drug-sensitivity data enabled the identification of proteins predicting resistance or sensitivity. The proteome profiles also hold considerable promise for analysing the composition and stoichiometry of protein complexes. ProteomicsDB thus enables navigation of proteomes, provides biological insight and fosters the development of proteomic technology.

A mass-spectrometry-based draft of the human proteome and a public database for analysis of proteome data are presented; assembled information is used to estimate the size of the protein-coding genome, to identify organ-specific proteins, proteins predicting drug resistance or sensitivity, and many translated long intergenic non-coding RNAs, and to reveal conserved control of protein abundance.

Mapping the human proteome

More than a decade after publication of the draft human genome sequence, there is no direct equivalent for the human proteome. But in this issue of Nature two groups present mass spectrometry-based analysis of human tissues, body fluids and cells mapping the large majority of the human proteome. Akhilesh Pandey and colleagues identified 17,294 protein-coding genes and provide evidence of tissue- and cell-restricted proteins through expression profiling. They highlight the importance of proteogenomic analysis by identifying translated proteins from annotated pseudogenes, non-coding RNAs and untranslated regions. The data set is available on http://www.humanproteomemap.org. Bernhard Kuster and colleagues have assembled protein evidence for 18,097 genes in ProteomicsDB (available on https://www.proteomicsdb.org) and highlight the utility of the data, for example the identification of hundreds of translated lincRNAs, drug-sensitivity markers and discovering the quantitative relationship between mRNA and protein levels in tissues. Elsewhere in this issue, Vivien Marx reports on a third major proteomics project, the antibody-based Human Protein Atlas programme (http://www.proteinatlas.org/).

Journal Article

Deep learning-driven fragment ion series classification enables highly precise and sensitive de novo peptide sequencing

by Wilhelm, Mathias; Hingerl, Johannes; Klaproth-Andrade, Daniela in 631/114/1305, 631/114/2784, 631/1647/296

2024

Unlike for DNA and RNA, accurate and high-throughput sequencing methods for proteins are lacking, hindering the utility of proteomics in applications where the sequences are unknown including variant calling, neoepitope identification, and metaproteomics. We introduce Spectralis, a de novo peptide sequencing method for tandem mass spectrometry. Spectralis leverages several innovations including a convolutional neural network layer connecting peaks in spectra spaced by amino acid masses, proposing fragment ion series classification as a pivotal task for de novo peptide sequencing, and a peptide-spectrum confidence score. On spectra for which database search provided a ground truth, Spectralis surpassed 40% sensitivity at 90% precision, nearly doubling state-of-the-art sensitivity. Application to unidentified spectra confirmed its superiority and showcased its applicability to variant calling. Altogether, these algorithmic innovations and the substantial sensitivity increase in the high-precision range constitute an important step toward broadly applicable peptide sequencing.

Accurate and high-throughput sequencing methods for proteins are lacking. Here the authors report Spectralis which improves de novo peptide sequencing using a convolutional layer that connects peaks in spectra spaced by amino acid masses, fragment ion series classification and a peptide-spectrum match confidence score.

Journal Article

Deep learning boosts sensitivity of mass spectrometry-based immunopeptidomics

by Schwencke-Westphal, Celina; Huhmer, Andreas; Wenschuh, Holger in 631/114/1305, 631/1647/296, 631/250/21/324

2021

Characterizing the human leukocyte antigen (HLA) bound ligandome by mass spectrometry (MS) holds great promise for developing vaccines and drugs for immune-oncology. Still, the identification of non-tryptic peptides presents substantial computational challenges. To address these, we synthesized and analyzed >300,000 peptides by multi-modal LC-MS/MS within the ProteomeTools project representing HLA class I & II ligands and products of the proteases AspN and LysN. The resulting data enabled training of a single model using the deep learning framework Prosit, allowing the accurate prediction of fragment ion spectra for tryptic and non-tryptic peptides. Applying Prosit demonstrates that the identification of HLA peptides can be improved up to 7-fold, that 87% of the proposed proteasomally spliced HLA peptides may be incorrect and that dozens of additional immunogenic neo-epitopes can be identified from patient tumors in published data. Together, the provided peptides, spectra and computational tools substantially expand the analytical depth of immunopeptidomics workflows.

The identification of HLA peptides by mass spectrometry is non-trivial. Here, the authors extended and used the wealth of data from the ProteomeTools project to improve the prediction of non-tryptic peptides using deep learning, and show their approach enables a variety of immunological discoveries.

Journal Article
