Catalogue Search | MBRL

Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set

by van Vlijmen, Herman W. T. , Papadatos, George , Kowalczyk, Wojtek in 7th Joint Sheffield Conference on Cheminformatics , Artificial neural networks , Bayesian analysis

2017

The increase of publicly available bioactivity data in recent years has fueled and catalyzed research in chemogenomics, data mining, and modeling approaches. As a direct result, over the past few years a multitude of different methods have been reported and evaluated, such as target fishing, nearest neighbor similarity-based methods, and Quantitative Structure Activity Relationship (QSAR)-based protocols. However, such studies are typically conducted on different datasets, using different validation strategies, and different metrics. In this study, different methods were compared using one single standardized dataset obtained from ChEMBL, which is made available to the public, using standardized metrics (BEDROC and Matthews Correlation Coefficient). Specifically, the performance of Naïve Bayes, Random Forests, Support Vector Machines, Logistic Regression, and Deep Neural Networks was assessed using QSAR and proteochemometric (PCM) methods. All methods were validated using both a random split validation and a temporal validation, with the latter being a more realistic benchmark of expected prospective execution. Deep Neural Networks are the top performing classifiers, highlighting the added value of Deep Neural Networks over other more conventional methods. Moreover, the best method (‘DNN_PCM’) performed significantly better at almost one standard deviation higher than the mean performance. Furthermore, Multi-task and PCM implementations were shown to improve performance over single task Deep Neural Networks. Conversely, target prediction performed almost two standard deviations under the mean performance. Random Forests, Support Vector Machines, and Logistic Regression performed around mean performance. Finally, using an ensemble of DNNs, alongside additional tuning, enhanced the relative performance by another 27% (compared with unoptimized ‘DNN_PCM’). 
Here, a standardized set to test and evaluate different machine learning algorithms in the context of multi-task learning is offered by providing the data and the protocols.
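One of the two metrics named in the abstract, the Matthews Correlation Coefficient, can be computed directly from a binary confusion matrix. The following is a minimal sketch of the standard formula, not code from the study; the function name and the convention of returning 0.0 when the denominator is zero are assumptions made for illustration.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; 1 is perfect agreement, 0 is chance level.
    Returning 0.0 for a zero denominator is a common convention, assumed here.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: 50 actives and 50 inactives, all classified correctly.
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)
# Hypothetical counts: predictions that are right exactly half of the time.
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)
```

Unlike plain accuracy, this score uses all four cells of the confusion matrix, which is why it is often preferred for imbalanced bioactivity data.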

Journal Article

DrugEx v2: de novo design of drug molecules by Pareto-based multi-objective reinforcement learning in polypharmacology

by van Vlijmen, Herman W. T. , Liu, Xuhan , IJzerman, Adriaan P. in Acceleration , Adenosine , Adenosine receptors

2021

In polypharmacology drugs are required to bind to multiple specific targets, for example to enhance efficacy or to reduce resistance formation. Although deep learning has achieved a breakthrough in de novo design in drug discovery, most of its applications only focus on a single drug target to generate drug-like active molecules. However, in reality drug molecules often interact with more than one target which can have desired (polypharmacology) or undesired (toxicity) effects. In a previous study we proposed a new method named DrugEx that integrates an exploration strategy into RNN-based reinforcement learning to improve the diversity of the generated molecules. Here, we extended our DrugEx algorithm with multi-objective optimization to generate drug-like molecules towards multiple targets or one specific target while avoiding off-targets (the two adenosine receptors, A1AR and A2AAR, and the potassium ion channel hERG in this study). In our model, we applied an RNN as the agent and machine learning predictors as the environment. Both the agent and the environment were pre-trained in advance and then interplayed under a reinforcement learning framework. The concept of evolutionary algorithms was merged into our method such that crossover and mutation operations were implemented by the same deep learning model as the agent. During the training loop, the agent generates a batch of SMILES-based molecules. Subsequently scores for all objectives provided by the environment are used to construct Pareto ranks of the generated molecules. For this ranking a non-dominated sorting algorithm and a Tanimoto-based crowding distance algorithm using chemical fingerprints are applied. Here, we adopted GPU acceleration to speed up the process of Pareto optimization. The final reward of each molecule is calculated based on the Pareto ranking with the ranking selection algorithm. The agent is trained under the guidance of the reward to make sure it can generate desired molecules after convergence of the training process. All in all we demonstrate generation of compounds with a diverse predicted selectivity profile towards multiple targets, offering the potential of high efficacy and low toxicity.
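The Pareto-ranking step described in the abstract can be illustrated with a simple non-dominated sort. This is a minimal sketch, not the DrugEx implementation: it assumes every objective is to be maximized, uses a plain quadratic front-peeling loop rather than a GPU-accelerated routine, and omits the Tanimoto-based crowding distance. The function names and the example scores are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives assumed to be maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_fronts(scores: List[Sequence[float]]) -> List[List[int]]:
    """Group molecule indices into successive Pareto fronts.

    Front 0 holds the molecules that no other molecule dominates; front 1
    holds those dominated only by members of front 0, and so on.
    """
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical predictor scores for four molecules on two objectives.
example_scores = [(0.9, 0.9), (0.8, 0.5), (0.5, 0.8), (0.4, 0.4)]
example_fronts = pareto_fronts(example_scores)  # [[0], [1, 2], [3]]
```

A reward can then be derived from the front index, so that molecules on earlier fronts are favoured during training; how ties within a front are broken is where a crowding distance would come in.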

Journal Article

An exploration strategy improves the diversity of de novo ligands using deep reinforcement learning: a case for the adenosine A2A receptor

by van Vlijmen, Herman W. T. , Liu, Xuhan , IJzerman, Adriaan P. in Adenosine , Adenosine receptors , Analysis

2019

Over the last 5 years deep learning has progressed tremendously in both image recognition and natural language processing. Now it is increasingly applied to other data rich fields. In drug discovery, recurrent neural networks (RNNs) have been shown to be an effective method to generate novel chemical structures in the form of SMILES. However, ligands generated by current methods have so far provided relatively low diversity and do not fully cover the whole chemical space occupied by known ligands. Here, we propose a new method (DrugEx) to discover de novo drug-like molecules. DrugEx is an RNN model (generator) trained through reinforcement learning which was integrated with a special exploration strategy. As a case study we applied our method to design ligands against the adenosine A2A receptor. From ChEMBL data, a machine learning model (predictor) was created to predict whether generated molecules are active or not. Based on this predictor as the reward function, the generator was trained by reinforcement learning without any further data. We then compared the performance of our method with two previously published methods, REINVENT and ORGANIC. We found that candidate molecules our model designed, and predicted to be active, had a larger chemical diversity and better covered the chemical space of known ligands compared to the state-of-the-art.

Journal Article

CalcAMP: A New Machine Learning Model for the Accurate Prediction of Antimicrobial Activity of Peptides

by Cordfunke, Robert A. , van Leeuwen, Remko , Riool, Martijn in Algorithms , Amino acids , Antibiotics

2023

To combat infection by microorganisms, host organisms possess a primary arsenal via the innate immune system. Among them are defense peptides with the ability to target a wide range of pathogenic organisms, including bacteria, viruses, parasites, and fungi. Here, we present the development of a novel machine learning model capable of predicting the activity of antimicrobial peptides (AMPs), CalcAMP. AMPs, in particular short ones (<35 amino acids), can become an effective solution to face the multi-drug resistance issue arising worldwide. Whereas finding potent AMPs through classical wet-lab techniques is still a long and expensive process, a machine learning model can be useful to help researchers to rapidly identify whether peptides present potential or not. Our prediction model is based on a new data set constructed from the available public data on AMPs and experimental antimicrobial activities. CalcAMP can predict activity against both Gram-positive and Gram-negative bacteria. Different features either concerning general physicochemical properties or sequence composition have been assessed to retrieve higher prediction accuracy. CalcAMP can be used as a promising prediction asset to identify short AMPs among given peptide sequences.

Journal Article

DrugEx v3: scaffold-constrained drug design with graph transformer-based reinforcement learning

by van Vlijmen, Herman W. T. , Liu, Xuhan , IJzerman, Adriaan P. in Adenosine , Analysis , Applications

2023

Rational drug design often starts from specific scaffolds to which side chains/substituents are added or modified due to the large drug-like chemical space available to search for novel drug-like molecules. With the rapid growth of deep learning in drug discovery, a variety of effective approaches have been developed for de novo drug design. In previous work we proposed a method named DrugEx, which can be applied in polypharmacology based on multi-objective deep reinforcement learning. However, the previous version is trained under fixed objectives and does not allow users to input any prior information (i.e. a desired scaffold). In order to improve the general applicability, we updated DrugEx to design drug molecules based on scaffolds which consist of multiple fragments provided by users. Here, a Transformer model was employed to generate molecular structures. The Transformer is a multi-head self-attention deep learning model containing an encoder to receive scaffolds as input and a decoder to generate molecules as output. In order to deal with the graph representation of molecules a novel positional encoding for each atom and bond based on an adjacency matrix was proposed, extending the architecture of the Transformer. The graph Transformer model contains growing and connecting procedures for molecule generation starting from a given scaffold based on fragments. Moreover, the generator was trained under a reinforcement learning framework to increase the number of desired ligands. As a proof of concept, the method was applied to design ligands for the adenosine A2A receptor (A2AAR) and compared with SMILES-based methods. The results show that 100% of the generated molecules are valid and most of them had a high predicted affinity value towards A2AAR with given scaffolds.

Journal Article

Quantitative prediction of selectivity between the A1 and A2A adenosine receptors

by van Vlijmen, Herman W. T. , IJzerman, Adriaan P. , Burggraaff, Lindsey in A1 adenosine receptor , A2A adenosine receptor , Adenosine

2020

The development of drugs is often hampered due to off-target interactions leading to adverse effects. Therefore, computational methods to assess the selectivity of ligands are of high interest. Currently, selectivity is often deduced from bioactivity predictions of a ligand for multiple targets (individual machine learning models). Here we show that modeling selectivity directly, by using the affinity difference between two drug targets as output value, leads to more accurate selectivity predictions. We test multiple approaches on a dataset consisting of ligands for the A1 and A2A adenosine receptors (among others classification, regression, and we define different selectivity classes). Finally, we present a regression model that predicts selectivity between these two drug targets by directly training on the difference in bioactivity, modeling the selectivity-window. The quality of this model was good as shown by the performances for fivefold cross-validation: ROC A1AR-selective 0.88 ± 0.04 and ROC A2AAR-selective 0.80 ± 0.07. To increase the accuracy of this selectivity model even further, inactive compounds were identified and removed prior to selectivity prediction by a combination of statistical models and structure-based docking. As a result, selectivity between the A1 and A2A adenosine receptors was predicted effectively using the selectivity-window model. The approach presented here can be readily applied to other selectivity cases.
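The selectivity-window idea in the abstract, training on the affinity difference between the two targets rather than on each affinity separately, amounts to a simple change of regression target. The sketch below only shows how such a target and a derived class label could be built. It assumes affinities are expressed on a log scale (for example pKi); the function names, the example values, and the 1.0 log-unit class threshold are hypothetical and are not taken from the study.

```python
def selectivity_window(p_affinity_a1: float, p_affinity_a2a: float) -> float:
    """Regression target: affinity difference between A1AR and A2AAR.

    Positive values indicate a preference for A1AR, negative for A2AAR.
    Inputs are assumed to be log-scale affinities such as pKi.
    """
    return p_affinity_a1 - p_affinity_a2a


def selectivity_class(delta: float, threshold: float = 1.0) -> str:
    """Map an affinity difference to a selectivity class.

    The default threshold of 1.0 log unit (a tenfold affinity difference)
    is an illustrative assumption.
    """
    if delta >= threshold:
        return "A1AR-selective"
    if delta <= -threshold:
        return "A2AAR-selective"
    return "non-selective"


# Hypothetical ligand with pKi 8.5 at A1AR and 6.0 at A2AAR.
delta = selectivity_window(8.5, 6.0)  # 2.5
label = selectivity_class(delta)      # "A1AR-selective"
```

A regression model fitted on `delta` predicts the window directly, so its error is not the sum of the errors of two separate per-target models.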

Journal Article

Machine Learning-Identified Potent Antimicrobial Peptides Against Multidrug-Resistant Bacteria and Skin Infections

by Cordfunke, Robert A. , Riool, Martijn , Nibbering, Peter H. in 3D human epidermal model , Amino acids , Antibiotic resistance

2025

Background: The escalating global crisis of antibiotic resistance necessitates the discovery of novel antimicrobial agents. Antimicrobial peptides (AMPs) represent a promising alternative to combat multidrug-resistant (MDR) pathogens. Because traditional AMP discovery is labour-intensive and costly, machine learning (ML) is applied to identify AMPs effective against MDR bacteria and skin infections. Methods: The ML-based CalcAMP model predicts the antimicrobial activity of 16,384 unique 14-amino-acid peptide sequences, resulting in a novel Guided Designed Smart antimicrobial Therapeutic (GDST) peptide catalogue. Parent sequences and retro-inverso (RI) variants of two prime GDST peptides undergo extensive testing against MDR bacteria and in skin infection models. Results: GDST-038 and GDST-045, along with their RI variants, show potent antimicrobial activity against Acinetobacter baumannii and Staphylococcus aureus, rapidly depolarizing the cytoplasmic membrane, exhibiting broad-spectrum bactericidal effects against ESKAPE pathogens, and causing minimal haemolysis. RI variants display superior A. baumannii biofilm killing compared to parent sequences, while all GDST peptides achieve >3-log reductions in S. aureus biofilm CFU within 24 h. Potent efficacy is observed in a 3D human skin epidermal infection model, with elimination of S. aureus at ≥15 μM. No resistance develops after 22 passages. Conclusions: ML-driven screening enables rapid identification of two novel candidate AMPs, highlighting the therapeutic potential of GDST peptides for MDR bacterial infections.

Journal Article

Reduced hepatitis B and D viral entry using clinically applied drugs as novel inhibitors of the bile acid transporter NTCP

by Donkers, Joanne M. , Beuers, Ulrich , Oude Elferink, Ronald P. J. in 13/109 , 14/19 , 14/35

2017

The sodium taurocholate co-transporting polypeptide (NTCP, SLC10A1) is the main hepatic transporter of conjugated bile acids, and the entry receptor for hepatitis B virus (HBV) and hepatitis delta virus (HDV). Myrcludex B, a synthetic peptide mimicking the NTCP-binding domain of HBV, effectively blocks HBV and HDV infection. In addition, Myrcludex B inhibits NTCP-mediated bile acid uptake, suggesting that also other NTCP inhibitors could potentially be a novel treatment of HBV/HDV infection. This study aims to identify clinically-applied compounds intervening with NTCP-mediated bile acid transport and HBV/HDV infection. 1280 FDA/EMA-approved drugs were screened to identify compounds that reduce uptake of taurocholic acid and lower Myrcludex B-binding in U2OS cells stably expressing human NTCP. HBV/HDV viral entry inhibition was studied in HepaRG cells. The four most potent inhibitors of human NTCP were rosiglitazone (IC50 5.1 µM), zafirlukast (IC50 6.5 µM), TRIAC (IC50 6.9 µM), and sulfasalazine (IC50 9.6 µM). Chicago sky blue 6B (IC50 7.1 µM) inhibited both NTCP and ASBT, a distinct though related bile acid transporter. Rosiglitazone, zafirlukast, TRIAC, sulfasalazine, and chicago sky blue 6B reduced HBV/HDV infection in HepaRG cells in a dose-dependent manner. Five out of 1280 clinically approved drugs were identified that inhibit NTCP-mediated bile acid uptake and HBV/HDV infection in vitro.

Journal Article

Improved Translational Relevance of In Vitro Fibrosis Models by Integrating IOX2-Mediated Hypoxia-Mimicking Pathways

by Venhorst, Jennifer , Caspers, Martien P. M. , Verschuren, Lars in Angiogenesis , Cell culture , Cell viability

2025

Background/Objectives: Preclinical models of liver fibrosis only partially mimic human disease processes. Particularly, traditional transforming growth factor beta 1 (TGFβ1)-induced hepatic stellate cell (HSC) models lack relevant processes, including hypoxia-induced pathways. Here, the ability of a hypoxia-mimicking compound (IOX2) to more accurately reflect the human fibrotic phenotype on a functional level was investigated. Methods: Human primary HSCs were stimulated (TGFβ1 +/− IOX2), and the cell viability and fibrotic phenotype were determined. The latter was assessed as protein levels of fibrosis markers—collagen, TIMP-1, and Fibronectin. Next-generation sequencing (NGS), differential expression analyses (DESeq2), and Ingenuity Pathway Analysis (IPA) were performed for mechanistic evaluation and biological annotation. Results: Stimulation with TGFβ1 + IOX2 significantly increased fibrotic marker levels. Also, fibrosis-related pathways were activated, and hypoxia-related genes and collagen modifications, such as crosslinking, increased dose-dependently. Comparative analysis with human fibrotic DEGs showed improved disease representation in the HSC model in the presence of IOX2. Conclusions: In conclusion, the HSC model better recapitulated liver fibrosis by IOX2 administration. Therefore, hypoxia-mimicking compounds hold promise for enhancing the translational value of in vitro fibrosis models, providing valuable insights in liver fibrosis pathogenesis and potential therapeutic strategies.

Journal Article

UnCorrupt SMILES: a novel approach to de novo design

by Schoenmaker, Linde , Jespers, Willem , van Westen, Gerard J. P. in Analog generation , Chemistry , Chemistry and Materials Science

2023

Generative deep learning models have emerged as a powerful approach for de novo drug design as they aid researchers in finding new molecules with desired properties. Despite continuous improvements in the field, a subset of the outputs that sequence-based de novo generators produce cannot be progressed due to errors. Here, we propose to fix these invalid outputs post hoc. In similar tasks, transformer models from the field of natural language processing have been shown to be very effective. Therefore, here this type of model was trained to translate invalid Simplified Molecular-Input Line-Entry System (SMILES) into valid representations. The performance of this SMILES corrector was evaluated on four representative methods of de novo generation: a recurrent neural network (RNN), a target-directed RNN, a generative adversarial network (GAN), and a variational autoencoder (VAE). This study has found that the percentage of invalid outputs from these specific generative models ranges between 4 and 89%, with different models having different error-type distributions. Post hoc correction of SMILES was shown to increase model validity. The SMILES corrector trained with one error per input alters 60–90% of invalid generator outputs and fixes 35–80% of them. However, a higher error detection and performance was obtained for transformer models trained with multiple errors per input. In this case, the best model was able to correct 60–95% of invalid generator outputs. Further analysis showed that these fixed molecules are comparable to the correct molecules from the de novo generators based on novelty and similarity. Additionally, the SMILES corrector can be used to expand the amount of interesting new molecules within the targeted chemical space. Introducing different errors into existing molecules yields novel analogs with a uniqueness of 39% and a novelty of approximately 20%. 
The results of this research demonstrate that SMILES correction is a viable post hoc extension and can enhance the search for better drug candidates.

Journal Article
