Catalogue Search | MBRL
Search Results Heading
Explore the vast range of titles available.
MBRLSearchResults
-
DisciplineDiscipline
-
Is Peer ReviewedIs Peer Reviewed
-
Item TypeItem Type
-
SubjectSubject
-
YearFrom:-To:
-
More FiltersMore FiltersSourceLanguage
Done
Filters
Reset
19
result(s) for
"Svozil, Daniel"
Sort by:
SYBA: Bayesian estimation of synthetic accessibility of organic compounds
by
Čmelo, Ivan
,
Kolář, Michal
,
Voršilák, Milan
in
Accessibility
,
Bayesian analysis
,
Bernoulli naïve Bayes
2020
SYBA (SYnthetic Bayesian Accessibility) is a fragment-based method for the rapid classification of organic compounds as easy- (ES) or hard-to-synthesize (HS). It is based on a Bernoulli naïve Bayes classifier that is used to assign SYBA score contributions to individual fragments based on their frequencies in the database of ES and HS molecules. SYBA was trained on ES molecules available in the ZINC15 database and on HS molecules generated by the Nonpher methodology. SYBA was compared with a random forest, that was utilized as a baseline method, as well as with other two methods for synthetic accessibility assessment: SAScore and SCScore. When used with their suggested thresholds, SYBA improves over random forest classification, albeit marginally, and outperforms SAScore and SCScore. However, upon the optimization of SAScore threshold (that changes from 6.0 to – 4.5), SAScore yields similar results as SYBA. Because SYBA is based merely on fragment contributions, it can be used for the analysis of the contribution of individual molecular parts to compound synthetic accessibility. SYBA is publicly available at
https://github.com/lich-uct/syba
under the GNU General Public License.
Journal Article
QSAR-derived affinity fingerprints (part 2): modeling performance for potency prediction
by
Bender, Andreas
,
Svozil, Daniel
,
Cortés-Ciriano, Isidro
in
Affinity
,
Affinity fingerprints
,
Big Data in Chemistry
2020
Affinity fingerprints report the activity of small molecules across a set of assays, and thus permit to gather information about the bioactivities of structurally dissimilar compounds, where models based on chemical structure alone are often limited, and model complex biological endpoints, such as human toxicity and in vitro cancer cell line sensitivity. Here, we propose to model in vitro compound activity using computationally predicted bioactivity profiles as compound descriptors. To this aim, we apply and validate a framework for the calculation of QSAR-derived affinity fingerprints (QAFFP) using a set of 1360 QSAR models generated using K
i
, K
d
, IC
50
and EC
50
data from ChEMBL database. QAFFP thus represent a method to encode and relate compounds on the basis of their similarity in bioactivity space. To benchmark the predictive power of QAFFP we assembled IC
50
data from ChEMBL database for 18 diverse cancer cell lines widely used in preclinical drug discovery, and 25 diverse protein target data sets. This study complements part 1 where the performance of QAFFP in similarity searching, scaffold hopping, and bioactivity classification is evaluated. Despite being inherently noisy, we show that using QAFFP as descriptors leads to errors in prediction on the test set in the ~ 0.65–0.95 pIC
50
units range, which are comparable to the estimated uncertainty of bioactivity data in ChEMBL (0.76–1.00 pIC
50
units). We find that the predictive power of QAFFP is slightly worse than that of Morgan2 fingerprints and 1D and 2D physicochemical descriptors, with an effect size in the 0.02–0.08 pIC
50
units range. Including QSAR models with low predictive power in the generation of QAFFP does not lead to improved predictive power. Given that the QSAR models we used to compute the QAFFP were selected on the basis of data availability alone, we anticipate better modeling results for QAFFP generated using more diverse and biologically meaningful targets. Data sets and Python code are publicly available at
https://github.com/isidroc/QAFFP_regression
.
Journal Article
InCHlib – interactive cluster heatmap for web applications
by
Bartůněk, Petr
,
Svozil, Daniel
,
Škuta, Ctibor
in
Application programming interface
,
Big data
,
Chemistry
2014
Background
Hierarchical clustering is an exploratory data analysis method that reveals the groups (clusters) of similar objects. The result of the hierarchical clustering is a tree structure called dendrogram that shows the arrangement of individual clusters. To investigate the row/column hierarchical cluster structure of a data matrix, a visualization tool called ‘cluster heatmap’ is commonly employed. In the cluster heatmap, the data matrix is displayed as a heatmap, a 2-dimensional array in which the colour of each element corresponds to its value. The rows/columns of the matrix are ordered such that similar rows/columns are near each other. The ordering is given by the dendrogram which is displayed on the side of the heatmap.
Results
We developed
InCHlib
(Interactive Cluster Heatmap Library), a highly interactive and lightweight
JavaScript
library for cluster heatmap visualization and exploration.
InCHlib
enables the user to select individual or clustered heatmap rows, to zoom in and out of clusters or to flexibly modify heatmap appearance. The cluster heatmap can be augmented with additional metadata displayed in a different colour scale. In addition, to further enhance the visualization, the cluster heatmap can be interconnected with external data sources or analysis tools. Data clustering and the preparation of the input file for
InCHlib
is facilitated by the Python utility script
inchlib_clust
.
Conclusions
The cluster heatmap is one of the most popular visualizations of large chemical and biomedical data sets originating, e.g., in high-throughput screening, genomics or transcriptomics experiments. The presented
JavaScript
library
InCHlib
is a client-side solution for cluster heatmap exploration.
InCHlib
can be easily deployed into any modern web application and configured to cooperate with external tools and data sources. Though
InCHlib
is primarily intended for the analysis of chemical or biological data, it is a versatile tool which application domain is not limited to the life sciences only.
Journal Article
Molpher: a software framework for systematic chemical space exploration
by
Voršilák, Milan
,
Svozil, Daniel
,
Škoda, Petr
in
Algorithms
,
Chemistry
,
Chemistry and Materials Science
2014
Background
Chemical space is virtual space occupied by all chemically meaningful organic compounds. It is an important concept in contemporary chemoinformatics research, and its systematic exploration is vital to the discovery of either novel drugs or new tools for chemical biology.
Results
In this paper, we describe Molpher, an open-source framework for the systematic exploration of chemical space. Through a process we term ‘molecular morphing’, Molpher produces a path of structurally-related compounds. This path is generated by the iterative application of so-called ‘morphing operators’ that represent simple structural changes, such as the addition or removal of an atom or a bond. Molpher incorporates an optimized parallel exploration algorithm, compound logging and a two-dimensional visualization of the exploration process. Its feature set can be easily extended by implementing additional morphing operators, chemical fingerprints, similarity measures and visualization methods. Molpher not only offers an intuitive graphical user interface, but also can be run in batch mode. This enables users to easily incorporate molecular morphing into their existing drug discovery pipelines.
Conclusions
Molpher is an open-source software framework for the design of virtual chemical libraries focused on a particular mechanistic class of compounds. These libraries, represented by a morphing path and its surroundings, provide valuable starting data for future
in silico
and
in vitro
experiments. Molpher is highly extensible and can be easily incorporated into any existing computational drug design pipeline.
Journal Article
Refinement of the AMBER Force Field for Nucleic Acids: Improving the Description of α/ γ Conformers
by
Orozco, Modesto
,
Svozil, Daniel
,
Marchán, Iván
in
Biophysical Theory and Modeling
,
Computer Simulation
,
DNA - chemistry
2007
We present here the parmbsc0 force field, a refinement of the AMBER parm99 force field, where emphasis has been made on the correct representation of the
α/
γ concerted rotation in nucleic acids (NAs). The modified force field corrects overpopulations of the
α/
γ
=
(g+,t) backbone that were seen in long (more than 10
ns) simulations with previous AMBER parameter sets (parm94-99). The force field has been derived by fitting to high-level quantum mechanical data and verified by comparison with very high-level quantum mechanical calculations and by a very extensive comparison between simulations and experimental data. The set of validation simulations includes two of the longest trajectories published to date for the DNA duplex (200
ns each) and the largest variety of NA structures studied to date (15 different NA families and 97 individual structures). The total simulation time used to validate the force field includes near 1
μs of state-of-the-art molecular dynamics simulations in aqueous solution.
Journal Article
Nonpher: computational method for design of hard-to-synthesize structures
2017
In cheminformatics, machine learning methods are typically used to classify chemical compounds into distinctive classes such as active/nonactive or toxic/nontoxic. To train a classifier, a training data set must consist of examples from both positive and negative classes. While a biological activity or toxicity can be experimentally measured, another important molecular property, a synthetic feasibility, is a more abstract feature that can’t be easily assessed. In the present paper, we introduce Nonpher, a computational method for the construction of a hard-to-synthesize virtual library. Nonpher is based on a molecular morphing algorithm in which new structures are iteratively generated by simple structural changes, such as the addition or removal of an atom or a bond. In Nonpher, molecular morphing was optimized so that it yields structures not overly complex, but just right hard-to-synthesize. Nonpher results were compared with SAscore and dense region (DR), other two methods for the generation of hard-to-synthesize compounds. Random forest classifier trained on Nonpher data achieves better results than models obtained using SAscore and DR data.
Journal Article
mmView: a web-based viewer of the mmCIF format
2011
Background
Structural biomolecular data are commonly stored in the PDB format. The PDB format is widely supported by software vendors because of its simplicity and readability. However, the PDB format cannot fully address many informatics challenges related to the growing amount of structural data. To overcome the limitations of the PDB format, a new textual format mmCIF was released in June 1997 in its version 1.0. mmCIF provides extra information which has the advantage of being in a computer readable form. However, this advantage becomes a disadvantage if a human must read and understand the stored data. While software tools exist to help to prepare mmCIF files, the number of available systems simplifying the comprehension and interpretation of the mmCIF files is limited.
Findings
In this paper we present mmView - a cross-platform web-based application that allows to explore comfortably the structural data of biomacromolecules stored in the mmCIF format. The mmCIF categories can be easily browsed in a tree-like structure, and the corresponding data are presented in a well arranged tabular form. The application also allows to display and investigate biomolecular structures via an integrated Java application Jmol.
Conclusions
The mmView software system is primarily intended for educational purposes, but it can also serve as a useful research tool. The mmView application is offered in two flavors: as an open-source stand-alone application (available from
http://sourceforge.net/projects/mmview
) that can be installed on the user's computer, and as a publicly available web server.
Journal Article