Catalogue Search | MBRL

Translational biomarker discovery in clinical metabolomics: An introductory tutorial

by Broadhurst, D.I , Xia, J , Wilson, M in acylcarnitine , amino acid , analytic method

2013

Metabolomics is increasingly being applied towards the identification of biomarkers for disease diagnosis, prognosis and risk prediction. Unfortunately, among the many published metabolomic studies focusing on biomarker discovery, there is very little consistency and relatively little rigor in how researchers select, assess or report their candidate biomarkers. In particular, few studies report any measure of sensitivity or specificity, or provide receiver operator characteristic (ROC) curves with associated confidence intervals. Even fewer studies explicitly describe or release the biomarker model used to generate their ROC curves. This is surprising given that for biomarker studies in most other biomedical fields, ROC curve analysis is generally considered the standard method for performance assessment. Because the ultimate goal of biomarker discovery is the translation of those biomarkers to clinical practice, it is clear that the metabolomics community needs to start "speaking the same language" in terms of biomarker analysis and reporting, especially if it wants to see metabolite markers being routinely used in the clinic. In this tutorial, we will first introduce the concept of ROC curves and describe their use in single biomarker analysis for clinical chemistry. This includes the construction of ROC curves, understanding the meaning of area under ROC curves (AUC) and partial AUC, as well as the calculation of confidence intervals. The second part of the tutorial focuses on biomarker analyses within the context of metabolomics. This section describes different statistical and machine learning strategies that can be used to create multi-metabolite biomarker models and explains how these models can be assessed using ROC curves. In the third part of the tutorial we discuss common issues and potential pitfalls associated with different analysis methods and provide readers with a list of nine recommendations for biomarker analysis and reporting.
To help readers test, visualize and explore the concepts presented in this tutorial, we also introduce a web-based tool called ROCCET (ROC Curve Explorer & Tester, http://www.roccet.ca). ROCCET was originally developed as a teaching aid but it can also serve as a training and testing resource to help metabolomics researchers build biomarker models and conduct a range of common ROC curve analyses for biomarker studies. © 2012 The Author(s).

Journal Article
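
As a rough illustration of two ideas named in the abstract above (the area under a ROC curve and a confidence interval for it), the following minimal sketch computes the AUC of a single biomarker as the fraction of case/control pairs that are ranked correctly, then estimates a percentile bootstrap interval. It is not taken from the tutorial or from ROCCET: the function names `roc_auc` and `bootstrap_auc_ci`, the toy data, and the choice of a percentile bootstrap are all assumptions made for illustration.

```python
import random


def roc_auc(labels, scores):
    """AUC as the share of (case, control) pairs where the case scores higher.

    Ties count as half a correct pair. Labels are 1 for cases, 0 for controls.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    correct = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                correct += 1.0
            elif c == k:
                correct += 0.5
    return correct / (len(cases) * len(controls))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (illustrative assumption)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that lost one of the two classes entirely.
        if 0 < sum(ys) < n:
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    low = aucs[int((alpha / 2) * n_boot)]
    high = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return low, high


# Hypothetical toy data: four controls (0) and four cases (1).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.45, 0.6, 0.7, 0.9]

auc = roc_auc(labels, scores)  # 0.8125 (13 of 16 pairs ranked correctly)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {auc:.4f}, 95% bootstrap CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With only eight samples the interval is very wide, which is consistent with the abstract's point that an AUC reported without a confidence interval says little about how a biomarker would perform in practice.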

The application of artificial neural networks in metabolomics: a historical perspective

by Reinke, Stacey N , Broadhurst, David I , Mendez, Kevin M in Computer applications , Learning algorithms , Metabolomics

2019

Background: Metabolomics data, with its complex covariance structure, is typically modelled by projection-based machine learning (ML) methods such as partial least squares (PLS) regression, which project data into a latent structure. Biological data are often non-linear, so it is reasonable to hypothesize that metabolomics data may also have a non-linear latent structure, which in turn would be best modelled using non-linear equations. A non-linear ML method with a similar projection equation structure to PLS is artificial neural networks (ANNs). While ANNs were first applied to metabolic profiling data in the 1990s, the lack of community acceptance combined with limitations in computational capacity and the lack of volume of data for robust non-linear model optimisation inhibited their widespread use. Due to recent advances in computational power, modelling improvements, community acceptance, and the more demanding needs for data science, ANNs have made a recent resurgence in interest across research communities, including a small yet growing usage in metabolomics. As metabolomics experiments become more complex and start to be integrated with other omics data, there is potential for ANNs to become a viable alternative to linear projection methods. Aim of review: We aim to first describe ANNs and their structural equivalence to linear projection-based methods, including PLS regression. We then review the historical, current, and future uses of ANNs in the field of metabolomics. Key scientific concept of review: Is metabolomics ready for the return of artificial neural networks?

Journal Article

Toward collaborative open data science in metabolomics using Jupyter Notebooks and cloud computing

by Reinke, Stacey N , Broadhurst, David I , Pritchard, Leighton in Cloud computing , Computer applications , Data science

2019

Background: A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns relating to reproduction and integrity of results. As an omics science, which generates vast amounts of data and relies heavily on data science for deriving biological meaning, metabolomics is highly vulnerable to irreproducibility. The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases. Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods and results. To enable FAIR data science in metabolomics, methods and results need to be transparently disseminated in a manner that is rapid, reusable, and fully integrated with the published work. To ensure broad use within the community such a framework also needs to be inclusive and intuitive for both computational novices and experts alike. Aim of review: To encourage metabolomics researchers from all backgrounds to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science. Key scientific concepts of review: This tutorial introduces the concept of interactive web-based computational laboratory notebooks. The reader is guided through a set of experiential tutorials specifically targeted at metabolomics researchers, based around the Jupyter Notebook web application, GitHub data repository, and Binder cloud computing platform.

Journal Article

Statistical strategies for avoiding false discoveries in metabolomics and related experiments

by Kell, Douglas B , Broadhurst, David I in bias , Biomarkers , Bonferroni correction

2006

Many metabolomics, and other high-content or high-throughput, experiments are set up such that the primary aim is the discovery of biomarker metabolites that can discriminate, with a certain level of certainty, between nominally matched 'case' and 'control' samples. However, it is unfortunately very easy to find markers that are apparently persuasive but that are in fact entirely spurious, and there are well-known examples in the proteomics literature. The main types of danger are not entirely independent of each other, but include bias, inadequate sample size (especially relative to the number of metabolite variables and to the required statistical power to prove that a biomarker is discriminant), excessive false discovery rate due to multiple hypothesis testing, inappropriate choice of particular numerical methods, and overfitting (generally caused by the failure to perform adequate validation and cross-validation). Many studies fail to take these into account, and thereby fail to discover anything of true significance (despite their claims). We summarise these problems, and provide pointers to a substantial existing literature that should assist in the improved design and evaluation of metabolomics experiments, thereby allowing robust scientific conclusions to be drawn from the available data. We provide a list of some of the simpler checks that might improve one's confidence that a candidate biomarker is not simply a statistical artefact, and suggest a series of preferred tests and visualisation tools that can assist readers and authors in assessing papers. These tools can be applied to individual metabolites by using multiple univariate tests performed in parallel across all metabolite peaks. They may also be applied to the validation of multivariate models. We stress in particular that classical p-values such as “p < 0.05”, that are often used in biomedicine, are far too optimistic when multiple tests are done simultaneously (as in metabolomics). 
Ultimately it is desirable that all data and metadata are available electronically, as this allows the entire community to assess conclusions drawn from them. These analyses apply to all high-dimensional 'omics' datasets.

Journal Article
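
To make the abstract's warning about "p < 0.05" under multiple testing concrete, here is a minimal sketch comparing an uncorrected threshold with two standard corrections. The Bonferroni correction appears among the record's subject terms; the Benjamini-Hochberg step-up procedure is added here as a common way to control the false discovery rate and is not claimed to be the authors' recommended method. The p-values and the function names `bonferroni` and `benjamini_hochberg` are hypothetical.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject only p-values at or below alpha divided by the number of tests."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject


# Hypothetical p-values from ten univariate tests, one per metabolite peak.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive = sum(p < 0.05 for p in pvals)    # 5 peaks pass an uncorrected threshold
strict = sum(bonferroni(pvals))         # 1 peak survives Bonferroni
fdr = sum(benjamini_hochberg(pvals))    # 2 peaks survive Benjamini-Hochberg
print(naive, strict, fdr)
```

On this toy input, five of ten peaks look significant without correction, while only one or two remain once the number of simultaneous tests is taken into account.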

A roadmap to precision medicine through post-genomic electronic medical records

by Reinke, Stacey N. , Chen, Qingwen , McGeachie, Michael in 21st century , 631/208 , 692/700

2025

The promise of integrating Electronic Medical Records (EMR) and genetic data for precision medicine has largely fallen short due to its omission of environmental context over time. Post-genomic data can bridge this gap by capturing the real-time dynamic relationship between underlying genetics and the environment. This perspective highlights the pivotal role of integrating EMR and post-genomics for personalized health, reflecting on lessons from past efforts, and outlining a roadmap of challenges and opportunities that must be addressed to realize the potential of precision medicine. The authors outline a framework uniting electronic medical records with post-genomic data to capture real-time physiological changes via periodic molecular snapshots, enabling a shift from reactive treatments to proactive, inclusive care.

Journal Article

The Metabolomic Profile of Umbilical Cord Blood in Neonatal Hypoxic Ischaemic Encephalopathy

by Wishart, David S. , Walsh, Brian H. , Murray, Deirdre M. in Amino acids , Asphyxia , Babies

2012

Hypoxic ischaemic encephalopathy (HIE) in newborns can cause significant long-term neurological disability. The insult is a complex injury characterised by energy failure and disruption of cellular homeostasis, leading to mitochondrial damage. The importance of individual metabolic pathways, and their interaction in the disease process, is not fully understood. The aim of this study was to describe and quantify the metabolomic profile of umbilical cord blood samples in a carefully defined population of full-term infants with HIE. The injury severity was defined using both the modified Sarnat score and continuous multichannel electroencephalogram. Using these classification systems, our population was divided into those with confirmed HIE (n = 31), asphyxiated infants without encephalopathy (n = 40) and matched controls (n = 71). All had umbilical cord blood drawn and biobanked at -80 °C within 3 hours of delivery. A combined direct injection and LC-MS/MS assay (AbsolutIDQ p180 kit, Biocrates Life Sciences AG, Innsbruck, Austria) was used for the metabolomic analyses of the samples. Targeted metabolomic analysis showed a significant alteration between study groups in 29 metabolites from 3 distinct classes (amino acids, acylcarnitines, and glycerophospholipids). Nine of these metabolites were significantly altered only between neonates with HIE and matched controls, while 14 were significantly altered in both study groups. Multivariate discriminant analysis models showed clear multifactorial metabolite associations with both asphyxia and HIE. A logistic regression model using 5 metabolites clearly delineates severity of asphyxia and classifies HIE infants with AUC = 0.92. These data describe widespread disruption to not only energy pathways, but also nitrogen and lipid metabolism in both asphyxia and HIE.
This study shows that a multi-platform targeted approach to metabolomic analyses using accurately phenotyped and meticulously biobanked samples provides insight into the pathogenesis of perinatal asphyxia. It highlights the potential for metabolomic technology to develop a diagnostic test for HIE.

Journal Article

Quality assurance and quality control processes: summary of a metabolomics community questionnaire

by Dunn, Warwick B. , Guillou, Claude , Viant, Mark R. in Biochemistry , Biomedical and Life Sciences , Biomedicine

2017

Introduction: The Metabolomics Society Data Quality Task Group (DQTG) developed a questionnaire regarding quality assurance (QA) and quality control (QC) to provide baseline information about current QA and QC practices applied in the international metabolomics community. Objectives: The DQTG has a long-term goal of promoting robust QA and QC in the metabolomics community through increased awareness via communication, outreach and education, and through the promotion of best working practices. An assessment of current QA and QC practices will serve as a foundation for future activities and development of appropriate guidelines. Method: QA was defined as the set of procedures that are performed in advance of analysis of samples and that are used to improve data quality. QC was defined as the set of activities that a laboratory does during or immediately after analysis that are applied to demonstrate the quality of project data. A questionnaire was developed that included 70 questions covering demographic information, QA approaches and QC approaches and allowed all respondents to answer a subset or all of the questions. Result: The DQTG questionnaire received 97 individual responses from 84 institutions in all fields of metabolomics covering NMR, LC-MS, GC-MS, and other analytical technologies. Conclusion: There was a vast range of responses concerning the use of QA and QC approaches that indicated the limited availability of suitable training, lack of Standard Operating Procedures (SOPs) to review and make decisions on quality, and limited use of standard reference materials (SRMs) as QC materials. The DQTG QA/QC questionnaire has for the first time demonstrated that QA and QC usage is not uniform across metabolomics laboratories. Here we present recommendations on how to address the issues concerning QA and QC measurements and reporting in metabolomics.

Journal Article

A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification

by Reinke, Stacey N , Broadhurst, David I , Mendez, Kevin M in Algorithms , Artificial intelligence , Choice learning

2019

Introduction: Metabolomics is increasingly being used in the clinical setting for disease diagnosis, prognosis and risk prediction. Machine learning algorithms are particularly important in the construction of multivariate metabolite prediction. Historically, partial least squares (PLS) regression has been the gold standard for binary classification. Nonlinear machine learning methods such as random forests (RF), kernel support vector machines (SVM) and artificial neural networks (ANN) may be more suited to modelling possible nonlinear metabolite covariance, and thus provide better predictive models. Objectives: We hypothesise that for binary classification using metabolomics data, non-linear machine learning methods will provide superior generalised predictive ability when compared to linear alternatives, in particular when compared with the current gold standard PLS discriminant analysis. Methods: We compared the general predictive performance of eight archetypal machine learning algorithms across ten publicly available clinical metabolomics data sets. The algorithms were implemented in the Python programming language. All code and results have been made publicly available as Jupyter notebooks. Results: There was only marginal improvement in predictive ability for SVM and ANN over PLS across all data sets. RF performance was comparatively poor. The use of out-of-bag bootstrap confidence intervals provided a measure of uncertainty of model prediction such that the quality of metabolomics data was observed to be a bigger influence on generalised performance than model choice. Conclusion: The size of the data set, and choice of performance metric, had a greater influence on generalised predictive performance than the choice of machine learning algorithm.

Journal Article

Direct infusion mass spectrometry metabolomics dataset: a benchmark for data processing and quality control

by Viant, Mark R , Broadhurst, David I , Kirwan, Jennifer A in 631/114/2402 , 631/45/320 , Animals

2014

Direct-infusion mass spectrometry (DIMS) metabolomics is an important approach for characterising molecular responses of organisms to disease, drugs and the environment. Increasingly large-scale metabolomics studies are being conducted, necessitating improvements in both bioanalytical and computational workflows to maintain data quality. This dataset represents a systematic evaluation of the reproducibility of a multi-batch DIMS metabolomics study of cardiac tissue extracts. It comprises twenty biological samples (cow vs. sheep) that were analysed repeatedly, in 8 batches across 7 days, together with a concurrent set of quality control (QC) samples. Data are presented from each step of the workflow and are available in MetaboLights. The strength of the dataset is that intra- and inter-batch variation can be corrected using QC spectra and the quality of this correction assessed independently using the repeatedly measured biological samples. Originally designed to test the efficacy of a batch-correction algorithm, it will enable others to evaluate novel data processing algorithms. Furthermore, this dataset serves as a benchmark for DIMS metabolomics, derived using best-practice workflows and rigorous quality assessment.
Design Type(s): replicate design • reference design • parallel group design
Measurement Type(s): metabolite profiling
Technology Type(s): mass spectrometry assay
Factor Type(s): Batch • material entity • Day of assay
Sample Characteristic(s): Bos taurus • Ovis aries • Right ventricle of heart
Machine-accessible metadata file describing the reported data (ISA-Tab format)

Journal Article

Evidence That Multiple Defects in Lipid Regulation Occur before Hyperglycemia during the Prodrome of Type-2 Diabetes

by Dunn, Warwick B. , Banerjee, Moulinath , Brown, Marie in Adiponectin , Adiponectin - blood , Adipose Tissue - metabolism

2014

Blood-vessel dysfunction arises before overt hyperglycemia in type-2 diabetes (T2DM). We hypothesised that a metabolomic approach might identify metabolites/pathways perturbed in this pre-hyperglycemic phase. To test this hypothesis and for specific metabolite hypothesis generation, serum metabolic profiling was performed in young women at increased, intermediate and low risk of subsequent T2DM. Participants were stratified by glucose tolerance during a previous index pregnancy into three risk groups: overt gestational diabetes (GDM; n = 18); those with glucose values in the upper quartile but below GDM levels (UQ group; n = 45); and controls (n = 43, below the median glucose values). Follow-up serum samples were collected at a mean of 22 months postnatally. Samples were analysed in a random order using Ultra Performance Liquid Chromatography coupled to an electrospray hybrid LTQ-Orbitrap mass spectrometer. Statistical analysis included principal component analysis (PCA) and multivariate methods. Significant between-group differences were observed at follow-up in waist circumference (86 (95% CI 79-91) vs 80 (76-84) cm for GDM vs controls, p < 0.05), adiponectin (about 33% lower in GDM group, p = 0.004), fasting glucose, post-prandial glucose and HbA1c, but the latter 3 all remained within the 'normal' range. Substantial differences in metabolite profiles were apparent between the 2 'at-risk' groups and controls, particularly in concentrations of phospholipids (4 metabolites with p ≤ 0.01), acylcarnitines (3 with p ≤ 0.02), short- and long-chain fatty acids (3 with p ≤ 0.03), and diglycerides (4 with p ≤ 0.05). Defects in adipocyte function from excess energy storage as relatively hypoxic visceral and hepatic fat, and impaired mitochondrial fatty acid oxidation may initiate the observed perturbations in lipid metabolism.
Together with evidence from the failure of glucose-directed treatments to improve cardiovascular outcomes, these data and those of others indicate that a new, quite different definition of type-2 diabetes is required. This definition would incorporate disturbed lipid metabolism prior to hyperglycemia.

Journal Article
