Catalogue Search | MBRL

Integrated pathway mining and selection of an artificial CYP79-mediated bypass to improve benzylisoquinoline alkaloid biosynthesis

by Akira Nakagawa , Hiromichi Minami , Akihiko Kondo in 3,4-dihydroxyphenylacetaldoxime , 3,4-Dihydroxyphenylacetic Acid - analogs & derivatives , 3,4-Dihydroxyphenylacetic Acid - metabolism

2024

Background Computational mining of useful enzymes and biosynthesis pathways is a powerful strategy for metabolic engineering. Through systematic exploration of all conceivable combinations of enzyme reactions, including both known compounds and those inferred from the chemical structures of established reactions, we can uncover previously undiscovered enzymatic processes. The application of the novel alternative pathways enables us to improve microbial bioproduction by bypassing or reinforcing metabolic bottlenecks. Benzylisoquinoline alkaloids (BIAs) are a diverse group of plant-derived compounds with important pharmaceutical properties. BIA biosynthesis has developed into a prime example of metabolic engineering and microbial bioproduction. The early bottleneck of BIA production in Escherichia coli consists of 3,4-dihydroxyphenylacetaldehyde (DHPAA) production and conversion to tetrahydropapaveroline (THP). Previous studies have selected monoamine oxidase (MAO) and DHPAA synthase (DHPAAS) to produce DHPAA from dopamine and oxygen; however, both of these enzymes produce toxic hydrogen peroxide as a byproduct. Results In the current study, in silico pathway design is applied to relieve the bottleneck of DHPAA production in the synthetic BIA pathway. Specifically, the cytochrome P450 enzyme, tyrosine N-monooxygenase (CYP79), is identified to bypass the established MAO- and DHPAAS-mediated pathways in an alternative arylacetaldoxime route to DHPAA with a peroxide-independent mechanism. The application of this pathway is proposed to result in less formation of toxic byproducts, leading to improved production of reticuline (up to 60 mg/L at the flask scale) when compared with that from the conventional MAO pathway. Conclusions This study showed improved reticuline production using the bypass pathway predicted by the M-path computational platform. Reticuline production in E. coli exceeded that of the conventional MAO-mediated pathway.
The study provides a clear example of the integration of pathway mining and enzyme design in creating artificial metabolic pathways and suggests further potential applications of this strategy in metabolic engineering.

Journal Article

Investigating RNA editing in deep transcriptome datasets with REDItools and REDIportal

by Pesole, Graziano , Picardi, Ernesto , Tangaro, Marco Antonio in 631/114/2163 , 631/1647/48 , 631/337

2020

RNA editing is a widespread post-transcriptional mechanism able to modify transcripts through insertions/deletions or base substitutions. It is prominent in mammals, in which millions of adenosines are deaminated to inosines by members of the ADAR family of enzymes. A-to-I RNA editing has a plethora of biological functions, but its detection in large-scale transcriptome datasets is still an unsolved computational task. To this aim, we developed REDItools, the first software package devoted to the RNA editing profiling in RNA-sequencing (RNAseq) data. It has been successfully used in human transcriptomes, proving the tissue and cell type specificity of RNA editing as well as its pervasive nature. Outcomes from large-scale REDItools analyses on human RNAseq data have been collected in our specialized REDIportal database, containing more than 4.5 million events. Here we describe in detail two bioinformatic procedures based on our computational resources, REDItools and REDIportal. In the first procedure, we outline a workflow to detect RNA editing in the human cell line NA12878, for which transcriptome and whole genome data are available. In the second procedure, we show how to identify dysregulated editing at specific recoding sites in post-mortem brain samples of Huntington disease donors. On a 64-bit computer running Linux with ≥32 GB of random-access memory (RAM), both procedures should take ~76 h, using 4 to 24 cores. Our protocols have been designed to investigate RNA editing in different organisms with available transcriptomic and/or genomic reads. Scripts to complete both procedures and a docker image are available at https://github.com/BioinfoUNIBA/REDItools . This protocol describes bioinformatics procedures to detect RNA editing in RNA-sequencing datasets using REDItools and REDIportal. REDItools is a software package to profile RNA editing, while known editing sites are collected in the REDIportal database.
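
The idea behind detecting A-to-I editing in RNA-seq data can be illustrated with a minimal sketch. Inosine is read as guanosine during sequencing, so an edited adenosine shows up as an A-to-G mismatch against the reference. The function below is not part of REDItools and does not reproduce its filters; the function name, the input layout and the `min_coverage` threshold are assumptions made for illustration only.

```python
def editing_frequency(base_counts, min_coverage=10):
    """Estimate the editing level at one reference-A position.

    base_counts: dict mapping a read base ("A", "G", ...) to the number of
                 RNA-seq reads showing that base at the position.
    min_coverage: hypothetical threshold; sites with fewer A+G reads are
                  skipped because the estimate would be too noisy.
    Returns G / (A + G), or None when coverage is below the threshold.
    """
    unedited = base_counts.get("A", 0)   # reads still showing adenosine
    edited = base_counts.get("G", 0)     # inosine is sequenced as guanosine
    coverage = unedited + edited
    if coverage < min_coverage:
        return None
    return edited / coverage


if __name__ == "__main__":
    # Hypothetical pileup at a single site.
    print(editing_frequency({"A": 30, "G": 10}))
```

A real analysis would also need to exclude genomic A/G variants, which is one reason the first procedure uses a cell line with whole genome data available.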

Journal Article

FELLA: an R package to enrich metabolomics data

by Fernández-Albert, Francesc , Perera-Lluna, Alexandre , Yanes, Oscar in Algorithms , Animal models , Biochemistry

2018

Background Pathway enrichment techniques are useful for understanding experimental metabolomics data. Their purpose is to give context to the affected metabolites in terms of the prior knowledge contained in metabolic pathways. However, the interpretation of a prioritized pathway list is still challenging, as pathways show overlap and cross talk effects. Results We introduce FELLA, an R package to perform a network-based enrichment of a list of affected metabolites. FELLA builds a hierarchical representation of an organism's biochemistry from the Kyoto Encyclopedia of Genes and Genomes (KEGG), containing pathways, modules, enzymes, reactions and metabolites. In addition to providing a list of pathways, FELLA reports intermediate entities (modules, enzymes, reactions) that link the input metabolites to them. This sheds light on pathway cross talk and potential enzymes or metabolites as targets for the condition under study. FELLA has been applied to six public datasets (three from Homo sapiens, two from Danio rerio and one from Mus musculus) and has reproduced findings from the original studies and from independent literature. Conclusions The R package FELLA offers an innovative enrichment concept starting from a list of metabolites, based on a knowledge graph representation of the KEGG database that focuses on interpretability. Besides reporting a list of pathways, FELLA suggests intermediate entities that are of interest per se. Its usefulness has been shown at several molecular levels on six public datasets, including human and animal models. The user can run the enrichment analysis through a simple interactive graphical interface or programmatically. FELLA is publicly available in Bioconductor under the GPL-3 license.

Journal Article

IP4M: an integrated platform for mass spectrometry-based metabolomics data mining

by Liang, Dandan , Zhou, Kejun , Xie, Guoxiang in Algorithms , Annotations , Bioinformatics

2020

Background Metabolomics data analyses rely on the use of bioinformatics tools. Many integrated multi-functional tools have been developed for untargeted metabolomics data processing and have been widely used. More alternative platforms are expected for both basic and advanced users. Results Integrated mass spectrometry-based untargeted metabolomics data mining (IP4M) software was designed and developed. IP4M has 62 functions categorized into 8 modules, covering all the steps of metabolomics data mining, including raw data preprocessing (alignment, peak de-convolution, peak picking, and isotope filtering), peak annotation, peak table preprocessing, basic statistical description, classification and biomarker detection, correlation analysis, cluster and sub-cluster analysis, regression analysis, ROC analysis, pathway and enrichment analysis, and sample size and power analysis. Additionally, a KEGG-derived metabolic reaction database was embedded, and a series of ratio variables (product/substrate) can be generated that carry additional information on enzyme activity. A new method, GRaMM, for correlation analysis between metabolome and microbiome data was also provided. IP4M provides both a number of parameters for customized and refined analysis (for expert users) and 4 simplified workflows with few key parameters (for beginners who are unfamiliar with computational metabolomics). The performance of IP4M was evaluated and compared with existing computational platforms using 2 data sets derived from a standards mixture and 2 data sets derived from serum samples, from GC–MS and LC–MS respectively. Conclusion IP4M is powerful, modularized, customizable and easy-to-use. It is a good choice for metabolomics data processing and analysis. Free versions for Windows, macOS, and Linux systems are provided.

Journal Article

Statistical methods and resources for biomarker discovery using metabolomics

by Elrayess, Mohamed A. , Althani, Asma A. , Diboun, Ilhame in Algorithms , Analysis , Analytical workflow

2023

Metabolomics is a dynamic tool for elucidating biochemical changes in human health and disease. Metabolic profiles provide a close insight into physiological states and are highly volatile to genetic and environmental perturbations. Variation in metabolic profiles can inform mechanisms of pathology, providing potential biomarkers for diagnosis and assessment of the risk of contracting a disease. With the advancement of high-throughput technologies, large-scale metabolomics data sources have become abundant. As such, careful statistical analysis of intricate metabolomics data is essential for deriving relevant and robust results that can be deployed in real-life clinical settings. Multiple tools have been developed for both data analysis and interpretations. In this review, we survey statistical approaches and corresponding statistical tools that are available for discovery of biomarkers using metabolomics.

Journal Article

DDIGIP: predicting drug-drug interactions based on Gaussian interaction profile kernels

by Yan, Cheng , Pan, Yi , Wu, Fang-Xiang in Algorithms , Area Under Curve , Bioinformatics

2019

Background A drug-drug interaction (DDI) is defined as a drug effect modified by another drug, which is very common in treating complex diseases such as cancer. Many studies have shown that DDIs can increase or decrease a drug's effect. However, adverse DDIs may result in severe morbidity and even mortality, and have caused some drugs to be withdrawn from the market. As multi-drug treatment becomes more and more common, identifying potential DDIs has become a key issue in drug development and disease treatment. However, traditional biological experimental methods, including in vitro and in vivo, are very time-consuming and expensive for validating new DDIs. With the development of high-throughput sequencing technology, many pharmaceutical studies and various bioinformatics data provide unprecedented opportunities to study DDIs. Results In this study, we propose a method to predict new DDIs, namely DDIGIP, which is based on the Gaussian Interaction Profile (GIP) kernel on the drug-drug interaction profiles and the Regularized Least Squares (RLS) classifier. In addition, we use k-nearest neighbors (KNN) to calculate the initial relational score in the presence of new drugs via the chemical, biological and phenotypic data of drugs. We compare the prediction performance of DDIGIP with other competing methods via 5-fold cross validation, 10-fold cross validation and de novo drug validation. Conclusion In 5-fold cross validation and 10-fold cross validation, the DDIGIP method achieves areas under the ROC curve (AUC) of 0.9600 and 0.9636, which are better than the 0.9570 and 0.9599 of the state-of-the-art method (L1 Classifier ensemble method). Furthermore, for new drugs, the AUC value of DDIGIP in de novo drug validation reaches 0.9262, which also outperforms the 0.9073 of the other state-of-the-art method (Weighted average ensemble method).
Case studies and these results demonstrate that DDIGIP is an effective method to predict DDIs while being beneficial to drug development and disease treatment.
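
The combination of a GIP kernel with an RLS classifier can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the commonly used formulation in which the kernel is K(i, j) = exp(-gamma * ||y_i - y_j||^2), with gamma scaled by the mean squared norm of the interaction profiles, and in which the RLS scores are K (K + sigma I)^-1 Y. The function names and the default values of `gamma_prime` and `sigma` are assumptions, and the KNN step for new drugs is not shown.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel between rows of a 0/1 matrix.

    profiles[i] is the interaction profile of drug i (1 = known DDI).
    gamma_prime is an assumed bandwidth parameter.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm  # normalise bandwidth by profile size
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n, m = len(a), len(b[0])
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + m):
                    aug[r][c] -= factor * aug[col][c]
    return [[aug[i][n + j] / aug[i][i] for j in range(m)] for i in range(n)]


def rls_scores(kernel, labels, sigma=1.0):
    """Regularized Least Squares scores: K (K + sigma * I)^-1 Y."""
    n, m = len(kernel), len(labels[0])
    regularised = [[kernel[i][j] + (sigma if i == j else 0.0)
                    for j in range(n)] for i in range(n)]
    weights = solve(regularised, labels)
    return [[sum(kernel[i][k] * weights[k][j] for k in range(n))
             for j in range(m)] for i in range(n)]


if __name__ == "__main__":
    # Hypothetical symmetric interaction matrix for four drugs.
    ddi = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
    for row in rls_scores(gip_kernel(ddi), ddi):
        print([round(score, 3) for score in row])
```

Higher scores for pairs with no known interaction would be read as candidate DDIs. A drug with an all-zero profile carries no information for the kernel, which is the situation the abstract's KNN-based initial score is meant to address.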

Journal Article

Precursor peptide-targeted mining of more than one hundred thousand genomes expands the lanthipeptide natural product family

by Hetrick, Kenton J. , Walker, Mark C. , van der Donk, Wilfred A. in Animal Genetics and Genomics , Annotations , Antibiotic

2020

Background Lanthipeptides belong to the ribosomally synthesized and post-translationally modified peptide group of natural products and have a variety of biological activities ranging from antibiotics to antinociceptives. These peptides are cyclized through thioether crosslinks and can bear other secondary post-translational modifications. While lanthipeptide biosynthetic gene clusters can be identified by the presence of genes encoding characteristic enzymes involved in the post-translational modification process, locating the precursor peptides encoded within these clusters is challenging due to their short length and high sequence variability, which limits the high-throughput exploration of lanthipeptide biosynthesis. To address this challenge, we enhanced the predictive capabilities of Rapid ORF Description & Evaluation Online (RODEO) to identify members of all four known classes of lanthipeptides. Results Using RODEO, we mined over 100,000 bacterial and archaeal genomes in the RefSeq database. We identified nearly 8500 lanthipeptide precursor peptides. These precursor peptides were identified in a broad range of bacterial phyla as well as the Euryarchaeota phylum of archaea. Bacteroidetes were found to encode a large number of these biosynthetic gene clusters, despite making up a relatively small portion of the genomes in this dataset. A number of these precursor peptides are similar to those of previously characterized lanthipeptides, but even more were not, including potential antibiotics. One such new antimicrobial lanthipeptide was purified and characterized. Additionally, examination of the biosynthetic gene clusters revealed that enzymes installing secondary post-translational modifications are more widespread than initially thought. 
Conclusion Lanthipeptide biosynthetic gene clusters are more widely distributed and the precursor peptides encoded within these clusters are more diverse than previously appreciated, demonstrating that the lanthipeptide sequence-function space remains largely underexplored.

Journal Article

Accurate prediction of flux distributions compatible with metabolite concentration effects in genome-scale metabolic networks

by Nikoloski, Zoran , Soleymani, Fayaz , Razaghi-Moghadam, Zahra in Biological research , Biology and Life Sciences , Biology, Experimental

2026

Intracellular fluxes shape all cellular functions, and understanding how they are shaped by the joint effects of enzyme abundances and metabolite concentrations in vivo currently requires gathering matched quantitative proteomic and metabolomic data sets from resource-intensive experiments. Here, we present KineFlux, a hybrid approach that combines machine learning with enzyme-constrained metabolic models to accurately predict steady-state flux distributions using only quantitative proteomic data. KineFlux builds machine learning models for metabolite concentration effects on reaction fluxes, obtained by using fluxomics and proteomics data from a training set of experiments. Using fluxomic and proteomic data sets of Escherichia coli and Saccharomyces cerevisiae, we show that the steady-state flux distributions predicted by KineFlux are in line with fluxes estimated by classical approaches. We also demonstrate that the machine learning models embedded in KineFlux are transferrable at marginal loss of accuracy using independent testing data from E. coli. Therefore, KineFlux expands the usability of enzyme-constrained models towards accurate prediction of genome-scale flux distributions compatible with metabolite concentration effects without knowledge of enzyme kinetics.

Journal Article

Predicting microbial interactions with approaches based on flux balance analysis: an evaluation

by Zafeiropoulos, Haris , Joseph, Clémence , Bernaerts, Kristel in Accuracy , Algorithms , Animals

2024

Background Given a genome-scale metabolic model (GEM) of a microorganism and criteria for optimization, flux balance analysis (FBA) predicts the optimal growth rate and its corresponding flux distribution for a specific medium. FBA has been extended to microbial consortia and thus can be used to predict interactions by comparing in-silico growth rates for co- and monocultures. Although FBA-based methods for microbial interaction prediction are becoming popular, a systematic evaluation of their accuracy has not yet been performed. Results Here, we evaluate the accuracy of FBA-based predictions of human and mouse gut bacterial interactions using growth data from the literature. For this, we collected 26 GEMs from the semi-curated AGORA database as well as four previously published curated GEMs. We tested the accuracy of three tools (COMETS, Microbiome Modeling Toolbox and MICOM) by comparing growth rates predicted in mono- and co-culture to growth rates extracted from the literature and also investigated the impact of different tool settings and media. We found that except for curated GEMs, predicted growth rates and their ratios (i.e. interaction strengths) do not correlate with growth rates and interaction strengths obtained from in vitro data. Conclusions Prediction of growth rates with FBA using semi-curated GEMs is currently not sufficiently accurate to predict interaction strengths reliably.
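
The comparison of co- and monoculture growth rates described above can be sketched in a few lines. The snippet takes growth rates as given, whether predicted by FBA or measured in vitro, and does not run FBA or call COMETS, the Microbiome Modeling Toolbox or MICOM. The function names, the `tolerance` band around a ratio of 1 and the interaction labels are assumptions for illustration, not the evaluation criteria used in the study.

```python
def interaction_strength(mono_rate, co_rate):
    """Ratio of a species' co-culture growth rate to its monoculture rate."""
    if mono_rate <= 0:
        raise ValueError("monoculture growth rate must be positive")
    return co_rate / mono_rate


def classify(ratio_a, ratio_b, tolerance=0.1):
    """Label a pairwise interaction from the two species' growth ratios.

    tolerance is an assumed band around 1.0 within which a species is
    treated as unaffected by its partner.
    """
    def sign(ratio):
        if ratio > 1 + tolerance:
            return "+"   # grows faster with the partner
        if ratio < 1 - tolerance:
            return "-"   # grows slower with the partner
        return "0"       # effectively unchanged

    labels = {
        ("+", "+"): "mutualism",
        ("-", "-"): "competition",
        ("0", "0"): "neutralism",
        ("+", "-"): "parasitism",
        ("+", "0"): "commensalism",
        ("-", "0"): "amensalism",
    }
    # Sort the pair so that the label does not depend on species order.
    return labels[tuple(sorted((sign(ratio_a), sign(ratio_b))))]


if __name__ == "__main__":
    # Hypothetical growth rates (1/h) for two species.
    ratio_a = interaction_strength(mono_rate=0.50, co_rate=0.75)
    ratio_b = interaction_strength(mono_rate=0.40, co_rate=0.40)
    print(ratio_a, ratio_b, classify(ratio_a, ratio_b))
```

Computing the same ratios from predicted and from measured growth rates and correlating them is one way to picture the accuracy comparison the abstract reports.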

Journal Article

ECDomainMiner: discovering hidden associations between enzyme commission numbers and Pfam domains

by Alborzi, Seyed Ziaeddin , Ritchie, David W. , Devignes, Marie-Dominique in Algorithms , Bioinformatics , Biomedical and Life Sciences

2017

Background Many entries in the protein data bank (PDB) are annotated to show their component protein domains according to the Pfam classification, as well as their biological function through the enzyme commission (EC) numbering scheme. However, despite the fact that the biological activity of many proteins often arises from specific domain-domain and domain-ligand interactions, current on-line resources rarely provide a direct mapping from structure to function at the domain level. Since the PDB now contains many tens of thousands of protein chains, and since protein sequence databases can dwarf such numbers by orders of magnitude, there is a pressing need to develop automatic structure-function annotation tools which can operate at the domain level. Results This article presents ECDomainMiner, a novel content-based filtering approach to automatically infer associations between EC numbers and Pfam domains. ECDomainMiner finds a total of 20,728 non-redundant EC-Pfam associations with a F-measure of 0.95 with respect to a “Gold Standard” test set extracted from InterPro. Compared to the 1515 manually curated EC-Pfam associations in InterPro, ECDomainMiner infers a 13-fold increase in the number of EC-Pfam associations. Conclusion These EC-Pfam associations could be used to annotate some 58,722 protein chains in the PDB which currently lack any EC annotation. The ECDomainMiner database is publicly available at http://ecdm.loria.fr/ .

Journal Article
