Catalogue Search | MBRL

The automated Galaxy-SynBioCAD pipeline for synthetic biology design and engineering

by University of Liverpool , Swainston, Neil , The University of Edinburgh in 49/56 , 631/114/2390 , 631/114/2398

2022

Here we introduce the Galaxy-SynBioCAD portal, a toolshed for synthetic biology, metabolic engineering, and industrial biotechnology. The tools and workflows currently shared on the portal enable one to build libraries of strains producing desired chemical targets, covering an end-to-end metabolic pathway design and engineering process from the selection of strains and targets, through the design of DNA parts to be assembled, to the generation of scripts driving liquid handlers for plasmid assembly and strain transformations. Standard formats like SBML and SBOL are used throughout to enforce the compatibility of the tools. In a study carried out at four different sites, we illustrate the link between pathway design and engineering with the building of a library of E. coli lycopene-producing strains. We also benchmark our workflows on literature- and expert-validated pathways. Overall, we find an 83% success rate in retrieving the validated pathways among the top 10 pathways generated by the workflows.

Journal Article

neo4jsbml: import systems biology markup language data into the graph database Neo4j

by Gricourt, Guillaume , Dérozier, Sandra , Faulon, Jean-Loup in Biology , Computer Science , Database management systems

2024

Systems Biology Markup Language (SBML) has emerged as a standard for representing biological models, facilitating model sharing and interoperability. It stores many types of data and complex relationships, complicating data management and analysis. Traditional database management systems struggle to effectively capture these complex networks of interactions within biological systems. Graph-oriented databases perform well in managing interactions between different entities. We present neo4jsbml, a new solution that bridges the gap between the Systems Biology Markup Language data and the Neo4j database, for storing, querying and analyzing data. The Systems Biology Markup Language organizes biological entities in a hierarchical structure, reflecting their interdependencies. The inherent graphical structure represents these hierarchical relationships, offering a natural and efficient means of navigating and exploring the model’s components. Neo4j is an excellent solution for handling this type of data. By representing entities as nodes and their relationships as edges, Cypher, Neo4j’s query language, efficiently traverses this type of graph representing complex biological networks. We have developed neo4jsbml, a Python library for importing Systems Biology Markup Language data into a Neo4j database using a user-defined schema. By leveraging Neo4j’s graphical database technology, exploration of complex biological networks becomes intuitive and information retrieval efficient. Neo4jsbml is a tool designed to import Systems Biology Markup Language data into a Neo4j database. Only the desired data is loaded into the Neo4j database. neo4jsbml is user-friendly and can become a useful new companion for visualizing and analyzing metabolic models through the Neo4j graphical database. neo4jsbml is open source software and available at https://github.com/brsynth/neo4jsbml .
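The node-and-edge view of an SBML model described above can be sketched with the Python standard library alone. This is a minimal illustration, not the neo4jsbml API: the function name `sbml_to_graph`, the labels `Species` and `Reaction`, the relationship names `IS_REACTANT` and `HAS_PRODUCT`, and the toy model are all assumptions made for this example, and nothing is written to a Neo4j database.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core documents.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Toy model (hypothetical): one reaction R1 converting species A into B.
TOY_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="A" compartment="c"/>
      <species id="B" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" reversible="false">
        <listOfReactants>
          <speciesReference species="A" stoichiometry="1"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="B" stoichiometry="1"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def sbml_to_graph(sbml_text):
    """Return (nodes, edges) extracted from an SBML string.

    nodes: list of (label, id) tuples
    edges: list of (start_id, relationship, end_id) tuples
    The labels and relationship names are illustrative assumptions.
    """
    root = ET.fromstring(sbml_text)
    nodes, edges = [], []
    # Every species becomes a node.
    for species in root.iterfind(".//sbml:species", SBML_NS):
        nodes.append(("Species", species.get("id")))
    # Every reaction becomes a node linked to its reactants and products.
    for reaction in root.iterfind(".//sbml:reaction", SBML_NS):
        rid = reaction.get("id")
        nodes.append(("Reaction", rid))
        for ref in reaction.iterfind(
            "sbml:listOfReactants/sbml:speciesReference", SBML_NS
        ):
            edges.append((ref.get("species"), "IS_REACTANT", rid))
        for ref in reaction.iterfind(
            "sbml:listOfProducts/sbml:speciesReference", SBML_NS
        ):
            edges.append((rid, "HAS_PRODUCT", ref.get("species")))
    return nodes, edges


nodes, edges = sbml_to_graph(TOY_SBML)
```

Once a model is in this form, each tuple maps naturally onto a graph-database node or relationship, which is the kind of structure a graph query language is designed to traverse.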

Journal Article

Reverse engineering molecules from fingerprints through deterministic enumeration and generative models

by Meyer, Philippe , Duigou, Thomas , Gricourt, Guillaume in Life Sciences

2025

Reverse engineering in molecular design aims to identify optimal structures based on activities, or properties, computed through molecular descriptors like fingerprints. This task is known to be particularly difficult for the widely used Extended-Connectivity Fingerprints (ECFPs), due to significant loss of structural information during vectorization. While recent artificial intelligence-based works have raised awareness about the privacy risks associated with ECFP-based data sharing, we contribute a more conclusive demonstration by introducing a deterministic algorithm that reconstructs molecular structures from ECFPs. Using MetaNetX and eMolecules as databases of natural compounds and commercially available chemicals, the deterministic algorithm benchmarks a Transformer-based generative model trained to predict SMILES from ECFPs. The generative model achieves a top-ranked retrieval accuracy of 95.64% but struggles with exhaustive enumeration. Additionally, applying the deterministic method to a drug dataset reveals its potential for de novo drug design, as many of the reverse-engineered structures are found to be patented or supported by bioassay data.

Journal Article

Reverse engineering molecules from fingerprints through deterministic enumeration and generative models

by Gricourt, Guillaume , Meyer, Philippe , Faulon, Jean-Loup in Accuracy , Algorithms , Artificial intelligence

2025

Reverse engineering in molecular design aims to identify optimal structures based on activities, or properties, computed through molecular descriptors like fingerprints. This task is known to be particularly difficult for the widely used Extended-Connectivity Fingerprints (ECFPs), due to significant loss of structural information during vectorization. While recent artificial intelligence-based works have raised awareness about the privacy risks associated with ECFP-based data sharing, we contribute a more conclusive demonstration by introducing a deterministic algorithm that reconstructs molecular structures from ECFPs. Using MetaNetX and eMolecules as databases of natural compounds and commercially available chemicals, the deterministic algorithm benchmarks a Transformer-based generative model trained to predict SMILES from ECFPs. The generative model achieves a top-ranked retrieval accuracy of 95.64% but struggles with exhaustive enumeration. Additionally, applying the deterministic method to a drug dataset reveals its potential for de novo drug design, as many of the reverse-engineered structures are found to be patented or supported by bioassay data. Scientific contribution: We present a deterministic algorithm that reconstructs molecular structures from ECFP vectors, demonstrating that these fingerprints are invertible. In parallel, we benchmark a Transformer-based generative model trained to predict SMILES from ECFPs, showing high accuracy but limitations in chemical space coverage. This dual approach advances reverse engineering in molecular design, offering new tools for de novo drug discovery.

Journal Article

Molecular structures enumeration and virtual screening in the chemical space with RetroPath2.0

by Duigou, Thomas , Carbonell, Pablo , Faulon, Jean-Loup in Chemical Sciences

2017

Journal Article

Molecular structures enumeration and virtual screening in the chemical space with RetroPath2.0

by Université Paris Saclay (COmUE) , French National Research Agency [ANR-15-CE1-0008] ; Biotechnology and Biological Sciences Research Council, Centre for synthetic biology of fine and speciality chemicals [BB/M017702/1] ; Synthetic Biology Applications for Protective Materials [EP/N025504/1] ; DGA ; Ecole Polytechnique , Duigou, Thomas in Algorithms , Aminoglycosides , Antibacterial activity

2017

Background: Network generation tools coupled with chemical reaction rules have been mainly developed for synthesis planning and more recently for metabolic engineering. Using the same core algorithm, these tools apply a set of rules to a source set of compounds, stopping when a sink set of compounds has been produced. When using the appropriate sink, source and rules, this core algorithm can be used for a variety of applications beyond those it has been developed for. Results: Here, we showcase the use of the open source workflow RetroPath2.0. First, we mathematically prove that we can generate all structural isomers of a molecule using a reduced set of reaction rules. We then use this enumeration strategy to screen the chemical space around a set of monomers and predict their glass transition temperatures, as well as around aminoglycosides to search for structures maximizing antibacterial activity. We also perform a screening around aminoglycosides with enzymatic reaction rules to ensure biosynthetic accessibility. We finally use our workflow on an E. coli model to complete the E. coli metabolome, with novel molecules generated using promiscuous enzymatic reaction rules. These novel molecules are searched for in the MS spectra of an E. coli cell lysate by interfacing our workflow with OpenMS through the KNIME Analytics Platform. Conclusion: We provide an easy to use and modify, modular, and open-source workflow. We demonstrate its versatility through a variety of use cases including molecular structure enumeration, virtual screening in the chemical space, and metabolome completion. Because it is open source and freely available on MyExperiment.org, workflow community contributions should likely expand further the features of the tool, even beyond the use cases presented in the paper.
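The core loop described in the Background (apply a set of rules to a source set, stop once a sink compound has been produced) can be sketched as a breadth-first expansion. This is a toy illustration under stated assumptions, not the RetroPath2.0 workflow: the function `expand_network`, the rule names, and the string edits standing in for reaction rules are all hypothetical, and the compounds are plain strings rather than parsed chemical structures.

```python
def expand_network(source, sink, rules, max_steps=10):
    """Apply every rule to newly found compounds until a sink compound appears.

    source, sink: sets of compound identifiers (plain strings here)
    rules: dict mapping a rule name to a function compound -> list of products
    Returns (known_compounds, reactions), where reactions is a list of
    (substrate, rule_name, product) tuples.
    """
    known = set(source)
    frontier = set(source)
    reactions = []
    for _ in range(max_steps):
        if known & set(sink):
            break  # a sink compound has been produced
        new = set()
        for compound in sorted(frontier):
            for name, rule in rules.items():
                for product in rule(compound):
                    reactions.append((compound, name, product))
                    if product not in known:
                        new.add(product)
        if not new:
            break  # no rule produced anything new
        known |= new
        frontier = new
    return known, reactions


# Hypothetical rules: simple string edits, NOT real reaction chemistry.
toy_rules = {
    "append_O": lambda c: [c + "O"] if not c.endswith("O") else [],
    "prepend_C": lambda c: ["C" + c] if len(c) < 3 else [],
}

known, reactions = expand_network({"C"}, {"CCO"}, toy_rules)
```

Swapping the source set, the sink set, or the rule dictionary changes what the same loop computes, which is the point the abstract makes about reusing one core algorithm for different applications.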

Journal Article

Reinforcement Learning for Bio-Retrosynthesis

by Jean-Loup Faulon , Duigou, Thomas , Koch, Mathilde in Bioinformatics , Computer applications , Enzymatic synthesis

2019

Metabolic engineering aims to produce chemicals of interest from living organisms, to advance towards greener chemistry. Despite efforts, the research and development process is still long and costly, and efficient computational design tools are required to explore the chemical biosynthetic space. Here, we propose to explore the bio-retrosynthesis space using an Artificial Intelligence based approach relying on the Monte Carlo Tree Search reinforcement learning method, guided by chemical similarity. We implement this method in RetroPath RL, an open-source and modular command line tool. We validate it on a golden dataset of 20 manually curated experimental pathways as well as on a larger dataset of 152 successful metabolic engineering projects. Moreover, we provide a novel feature that suggests potential media supplements to complement the enzymatic synthesis plan. Footnotes: https://github.com/brsynth/RetroPathRL

Paper

PeroxiHUB: a modular cell-free biosensing platform using H2O2 as signal integrator

by Voyvodic, Peter L , Kushwaha, Manish , Soudier, Paul in Adaptability , Biosensors , Hydrogen peroxide

2022

Cell-free systems have great potential for delivering robust, cheap, and field-deployable biosensors. Many cell-free biosensors rely on transcription factors responding to small molecules, but their discovery and implementation still remain challenging. Here we report the engineering of PeroxiHUB, an optimized H2O2-centered sensing platform supporting cell-free detection of different metabolites. H2O2 is a central metabolite and a by-product of numerous enzymatic reactions. PeroxiHUB uses enzymatic transducers to convert metabolites of interest into H2O2, enabling rapid reprogramming of sensor specificity using alternative transducers. We first screen several transcription factors and optimize OxyR for the transcriptional response to H2O2 in cell-free systems, highlighting the need for pre-incubation steps to obtain suitable signal-to-noise ratios. We then demonstrate modular detection of metabolites of clinical interest (lactate, sarcosine, and choline) using different transducers mined via a custom retro-synthesis workflow publicly available on the SynBioCAD Galaxy portal. We find that expressing the transducer during the pre-incubation step is crucial for optimal sensor operation. Finally, we show that different reporters can be connected to PeroxiHUB, providing high adaptability for various applications. Given the wide range of enzymatic reactions producing H2O2, the PeroxiHUB platform will support cell-free detection of a large number of metabolites in a modular and scalable fashion. Competing Interest Statement: The authors have declared no competing interest. Footnotes: https://galaxy-synbiocad.org/root/login?redirect=%2Fworkflows%2Flist_published

Paper

Galaxy-SynBioCAD: Automated Pipeline for Synthetic Biology Design and Engineering

by Swainston, Neil , El-Moubayed, Yorgo , Duigou, Thomas in Synthetic Biology

2022

We introduce the Galaxy-SynBioCAD portal, the first toolshed for synthetic biology, metabolic engineering, and industrial biotechnology. The tools and workflows currently shared on the portal enable one to build libraries of strains producing desired chemical targets, covering an end-to-end metabolic pathway design and engineering process from the selection of strains and targets, through the design of DNA parts to be assembled, to the generation of scripts driving liquid handlers for plasmid assembly and strain transformations. Standard formats like SBML and SBOL are used throughout to enforce the compatibility of the tools. In a study carried out at four different sites, we illustrate the link between pathway design and engineering with the building of a library of E. coli lycopene-producing strains. We also benchmarked our workflows on literature- and expert-validated pathways. Overall, we find an 83% success rate in retrieving the validated pathways among the top 10 pathways generated by the workflows.

Paper

Molecular structures enumeration and virtual screening in the chemical space with RetroPath2.0

by Carbonell, Pablo , Jean-Loup Faulon , Duigou, Thomas in Algorithms , Antibacterial activity , E coli

2017

Background: Network generation tools coupled with chemical reaction rules have been mainly developed for synthesis planning and more recently for metabolic engineering. Using the same core algorithm, these tools apply a set of rules to a source set of compounds, stopping when a sink set of compounds has been produced. When using the appropriate sink, source and rules, this core algorithm can be used for a variety of applications beyond those it has been developed for. Results: Here, we showcase the use of the open source workflow RetroPath2.0. First, we mathematically prove that we can generate all structural isomers of a molecule using a reduced set of reaction rules. We then use this enumeration strategy to screen the chemical space around a set of monomers and predict their glass transition temperatures, as well as around aminoglycosides to search for structures maximizing antibacterial activity. We also perform a screening around aminoglycosides with enzymatic reaction rules to ensure biosynthetic accessibility. We finally use our workflow on an E. coli model to complete the E. coli metabolome, with novel molecules generated using promiscuous enzymatic reaction rules. These novel molecules are searched for in the MS spectra of an E. coli cell lysate by interfacing our workflow with OpenMS through the KNIME Analytics Platform. Conclusion: We provide an easy to use and modify, modular, and open-source workflow. We demonstrate its versatility through a variety of use cases including molecular structure enumeration, virtual screening in the chemical space, and metabolome completion. Because it is open source and freely available on MyExperiment.org, workflow community contributions should likely expand further the features of the tool, even beyond the use cases presented in the paper.

Paper
