Catalogue Search | MBRL

Visualising lead optimisation series using reduced graphs

by Gillet, Valerie J. , Pickett, Stephen D. , Stacey, Jessica in Automation , Chemistry , Chemistry and Materials Science

2025

The typical way in which lead optimisation (LO) series are represented in the medicinal chemistry literature is as Markush structures and associated R-group tables. The Markush structure shows a central core or molecular scaffold that is common to the series with R groups that indicate the points of variability that have been explored in the series. The associated R-group table shows the substituent combinations that exist in individual molecules in the series together with properties of those compounds. This format provides an intuitive way of visualising any structure–activity relationship (SAR) that is present. Automated approaches that attempt to reproduce this well understood format, such as the SAR map, are based on maximum common substructure approaches and do not take account of small changes that may be made to the core structure itself or of the situation where more than one core exists in the data. Here we describe an automated approach to represent LO series that is based on reduced graph descriptions of molecules. A publicly available LO dataset from a drug discovery programme at GSK is analysed to show how the method can group together compounds from the same series even when there are small substructural differences within the core of the series while also being able to identify different related compound series. The resulting visualisation is useful in identifying areas where series are underexplored and for mapping design ideas onto the current dataset. The code to generate the visualisations is released into the public domain to promote further research in this area.

Scientific contribution: We describe a software tool for analysing lead optimisation series using reduced graph representations of molecules. The representation allows compounds that have similar but not identical chemical scaffolds to be grouped together and is, therefore, an advance on methods that are based on the more traditional Markush structure and SAR tables. The software is a useful addition to the med chem toolbox as it can provide a holistic view of lead optimisation data by representing what might otherwise be seen as separate series as a single series of compounds.

Journal Article

Analysis of the benefits of imputation models over traditional QSAR models for toxicity prediction

by Gillet, Valerie J. , Allen, Luke N. , Webb, Samuel J. in Analysis , Assaying , Biocompatibility

2022

Recently, imputation techniques have been adapted to predict activity values among sparse bioactivity matrices, showing improvements in predictive performance over traditional QSAR models. These models are able to use experimental activity values for auxiliary assays when predicting the activity of a test compound on a specific assay. In this study, we tested three different multi-task imputation techniques on three classification-based toxicity datasets: two of small scale (12 assays each) and one large scale with 417 assays. Moreover, we analyzed in detail the improvements shown by the imputation models. We found that test compounds that were dissimilar to training compounds, as well as test compounds with a large number of experimental values for other assays, showed the largest improvements. We also investigated the impact of sparsity on the improvements seen as well as the relatedness of the assays being considered. Our results show that even a small amount of additional information can provide imputation methods with a strong boost in predictive performance over traditional single task and multi-task predictive models.

Journal Article

Effect of missing data on multitask prediction methods

by Gillet, Valerie J. , de la Vega de León, Antonio , Chen, Beining in Analysis , Artificial neural networks , Bayesian analysis

2018

There has been a growing interest in multitask prediction in chemoinformatics, helped by the increasing use of deep neural networks in this field. This technique is applied to multitarget data sets, where compounds have been tested against different targets, with the aim of developing models to predict a profile of biological activities for a given compound. However, multitarget data sets tend to be sparse; i.e., not all compound-target combinations have experimental values. There has been little research on the effect of missing data on the performance of multitask methods. We have used two complete data sets to simulate sparseness by removing data from the training set. Different models to remove the data were compared. These sparse sets were used to train two different multitask methods, deep neural networks and Macau, which is a Bayesian probabilistic matrix factorization technique. Results from both methods were remarkably similar and showed that the performance decrease because of missing data is at first small before accelerating after large amounts of data are removed. This work provides a first approximation to assess how much data is required to produce good performance in multitask prediction exercises.
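The idea of simulating sparseness from a complete data set can be illustrated with a minimal sketch. It assumes the simplest possible removal model, uniform random deletion of cells from a compound-by-target matrix; the abstract says several removal models were compared but does not specify them, so this scheme, the function name `simulate_sparseness`, and its parameters are hypothetical.

```python
import random


def simulate_sparseness(matrix, fraction, seed=0):
    """Return a copy of a complete compound-by-target matrix in which
    a given fraction of cells has been replaced by None.

    Assumption: cells are removed uniformly at random. This is only one
    possible removal model and is not taken from the paper.
    """
    rng = random.Random(seed)
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Enumerate every (compound, target) cell of the complete matrix.
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    n_remove = round(fraction * len(cells))
    removed = set(rng.sample(cells, n_remove))
    # Build a new matrix so the complete data set is left untouched.
    return [
        [None if (i, j) in removed else matrix[i][j] for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Toy complete data set: 4 compounds tested against 3 targets.
complete = [
    [6.1, 5.2, 7.0],
    [4.8, 6.3, 5.5],
    [7.2, 4.9, 6.6],
    [5.0, 5.8, 6.9],
]
sparse = simulate_sparseness(complete, fraction=0.5)
n_missing = sum(value is None for row in sparse for value in row)
```

Training a model on `sparse` at increasing values of `fraction`, while evaluating on a fixed test set, is one way to trace how performance changes as data are removed.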

Journal Article

Identification of compounds that rescue otic and myelination defects in the zebrafish adgrg6 (gpr126) mutant

by Asad, Anzar , Diamantopoulou, Elvira , de la Vega de León, Antonio in Adgrg6 (Gpr126) , adhesion GPCR , Alleles

2019

Adgrg6 (Gpr126) is an adhesion class G protein-coupled receptor with a conserved role in myelination of the peripheral nervous system. In the zebrafish, mutation of adgrg6 also results in defects in the inner ear: otic tissue fails to down-regulate versican gene expression and morphogenesis is disrupted. We have designed a whole-animal screen that tests for rescue of both up- and down-regulated gene expression in mutant embryos, together with analysis of weak and strong alleles. From a screen of 3120 structurally diverse compounds, we have identified 68 that reduce versican b expression in the adgrg6 mutant ear, 41 of which also restore myelin basic protein gene expression in Schwann cells of mutant embryos. Nineteen compounds unable to rescue a strong adgrg6 allele provide candidates for molecules that may interact directly with the Adgrg6 receptor. Our pipeline provides a powerful approach for identifying compounds that modulate GPCR activity, with potential impact for future drug design.

Journal Article

A comparison of the pharmacophore identification programs: Catalyst, DISCO and GASP

by Gillet, Valerie J. , Bravi, Gianpaolo , Leach, Andrew R. in Algorithms , Catalysts , CDC2-CDC28 Kinases

2002

Three commercially available pharmacophore generation programs, Catalyst/HipHop, DISCO and GASP, were compared on their ability to generate known pharmacophores deduced from protein-ligand complexes extracted from the Protein Data Bank. Five different protein families were included: Thrombin, Cyclin Dependent Kinase 2, Dihydrofolate Reductase, HIV Reverse Transcriptase and Thermolysin. Target pharmacophores were defined through visual analysis of the data sets. The pharmacophore models produced were evaluated qualitatively through visual inspection and according to their ability to generate the target pharmacophores. Our results show that GASP and Catalyst outperformed DISCO at reproducing the five target pharmacophores.

Journal Article

Enhancing reaction-based de novo design using a multi-label reaction class recommender

by Wallace, James E A , Ghiandoni, Gian Marco , Webster, James in Accessibility , Algorithms , Chemical reactions

2020

Reaction-based de novo design refers to the in-silico generation of novel chemical structures by combining reagents using structural transformations derived from known reactions. The driver for using reaction-based transformations is to increase the likelihood of the designed molecules being synthetically accessible. We have previously described a reaction-based de novo design method based on reaction vectors which are transformation rules that are encoded automatically from reaction databases. A limitation of reaction vectors is that they account for structural changes that occur at the core of a reaction only, and they do not consider the presence of competing functionalities that can compromise the reaction outcome. Here, we present the development of a Reaction Class Recommender to enhance the reaction vector framework. The recommender is intended to be used as a filter on the reaction vectors that are applied during de novo design to reduce the combinatorial explosion of in-silico molecules produced while limiting the generated structures to those which are most likely to be synthesisable. The recommender has been validated using an external data set extracted from the recent medicinal chemistry literature and in two simulated de novo design experiments. Results suggest that the use of the recommender drastically reduces the number of solutions explored by the algorithm while preserving the chance of finding relevant solutions and increasing the global synthetic accessibility of the designed molecules.

Journal Article

Development and validation of an improved algorithm for overlaying flexible molecules

by Gillet, Valerie J. , Gardiner, Eleanor J. , Taylor, Robin in Algorithms , Animal Anatomy , Chemistry

2012

A program for overlaying multiple flexible molecules has been developed. Candidate overlays are generated by a novel fingerprint algorithm, scored on three objective functions (union volume, hydrogen-bond match, and hydrophobic match), and ranked by constrained Pareto ranking. A diverse subset of the best ranked solutions is chosen using an overlay-dissimilarity metric. If necessary, the solutions can be optimised. A multi-objective genetic algorithm can be used to find additional overlays with a given mapping of chemical features but different ligand conformations. The fingerprint algorithm may also be used to produce constrained overlays, in which user-specified chemical groups are forced to be superimposed. The program has been tested on several sets of ligands, for each of which the true overlay is known from protein–ligand crystal structures. Both objective and subjective success criteria indicate that good results are obtained on the majority of these sets.

Journal Article

Incorporating partial matches within multiobjective pharmacophore identification

by Gillet, Valerie J. , Cottrell, Simon J. , Taylor, Robin in Genetic algorithms , Studies

2006

Issue Title: Advances in Pharmacophores and 3-D Screening

This paper describes the extension of our earlier multiobjective method for generating plausible pharmacophore hypotheses to incorporate partial matches. Diverse sets of molecules rarely adopt exactly the same binding mode, and so allowing the identification of partial matches allows our program to be applied to larger and more diverse datasets. The method explores the conformational space of a series of ligands simultaneously with their alignment using a multiobjective genetic algorithm (MOGA). The principles of Pareto ranking are used to evolve a diverse set of pharmacophore hypotheses that are optimised on conformational energy of the ligands, the goodness of the overlay and the volume of the overlay. A partial match is defined as a pharmacophoric feature that is present in at least two, but not all, of the ligands in the set. The number of ligands that map to a given pharmacophore point is taken into account when evaluating an overlay. The method is applied to a number of test cases extracted from the Protein Data Bank (PDB) where the true overlay is known.
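Pareto ranking and the partial-match definition can be illustrated with a minimal sketch. It assumes every objective is expressed as a cost to be minimised and uses one common convention in which a solution's rank is the number of solutions that dominate it; the abstract does not state which ranking variant or objective directions the program uses, and the function names and score values below are hypothetical.

```python
def dominates(a, b):
    """True if score vector a is no worse than b on every objective and
    strictly better on at least one (all objectives minimised: assumption)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_ranks(scores):
    """Rank each solution by how many other solutions dominate it.
    Non-dominated solutions receive rank 0."""
    return [sum(dominates(other, s) for other in scores) for s in scores]


def is_partial_match(n_ligands_with_feature, n_ligands):
    """A feature is a partial match if it is present in at least two,
    but not all, of the ligands in the set."""
    return 2 <= n_ligands_with_feature < n_ligands


# Illustrative score vectors for four candidate overlays, each holding
# three cost-like objectives (made-up numbers, not data from the paper).
overlays = [
    (1.0, 5.0, 2.0),
    (2.0, 4.0, 2.0),
    (2.0, 6.0, 3.0),
    (3.0, 7.0, 4.0),
]
ranks = pareto_ranks(overlays)  # [0, 0, 2, 3]
```

The first two overlays trade off against each other, so neither dominates the other and both receive rank 0; keeping all such non-dominated solutions is what allows a multiobjective method to return several alternative hypotheses rather than a single one.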

Journal Article

Generation of multiple pharmacophore hypotheses using multiobjective optimisation techniques

by Gillet, Valerie J. , Cottrell, Simon J. , Wilton, David J. in Algorithms , Binding Sites , Computer-Aided Design

2004

Pharmacophore methods provide a way of establishing a structure activity relationship for a series of known active ligands. Often, there are several plausible hypotheses that could explain the same set of ligands and, in such cases, it is important that the chemist is presented with alternatives that can be tested with different synthetic compounds. Existing pharmacophore methods involve either generating an ensemble of conformers and considering each conformer of each ligand in turn or exploring conformational space on-the-fly. The ensemble methods tend to produce a large number of hypotheses and require considerable effort to analyse the results, whereas methods that vary conformation on-the-fly typically generate a single solution that represents one possible hypothesis, even though several might exist. We describe a new method for generating multiple pharmacophore hypotheses with full conformational flexibility being explored on-the-fly. The method is based on multiobjective evolutionary algorithm techniques and is designed to search for an ensemble of diverse yet plausible overlays which can then be presented to the chemist for further investigation.

Journal Article

Reviews in computational chemistry

by Cundari, Thomas R , Gillet, Valerie J , Boyd, Donald B in Chemistry

2006

FROM REVIEWS OF THE SERIES

"Reviews in Computational Chemistry remains the most valuable reference to methods and techniques in computational chemistry." -JOURNAL OF MOLECULAR GRAPHICS AND MODELLING

"One cannot generally do better than to try to find an appropriate article in the highly successful Reviews in Computational Chemistry. The basic philosophy of the editors seems to be to help the authors produce chapters that are complete, accurate, clear, and accessible to experimentalists (in particular) and other nonspecialists (in general)." -JOURNAL OF THE AMERICAN CHEMICAL SOCIETY

eBook
