Catalogue Search | MBRL
9 result(s) for "Jeliazkov, Vedrin"
ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics
by Ashby, Thomas J.; Georgiev, Ivan; Jeliazkov, Vedrin
in Annotations; Big Data; Chemical compounds
2017
Chemogenomics data generally refers to the activity data of chemical compounds on an array of protein targets and represents an important source of information for building in silico target prediction models. The increasing volume of chemogenomics data offers exciting opportunities to build models based on Big Data. Preparing a high quality data set is a vital step in realizing this goal and this work aims to compile such a comprehensive chemogenomics dataset. This dataset comprises over 70 million SAR data points from publicly available databases (PubChem and ChEMBL) including structure, target information and activity annotations. Our aspiration is to create a useful chemogenomics resource reflecting industry-scale data not only for building predictive models of in silico polypharmacology and off-target effects but also for the validation of cheminformatics approaches in general.
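As a rough illustration of the kind of records such a dataset holds, the sketch below groups SAR data points by protein target. The compound identifiers, gene symbols, and pXC50 values are invented for illustration and do not reproduce the actual ExCAPE-DB schema.

```python
# Minimal sketch of grouping chemogenomics SAR records by target;
# field names and values are illustrative assumptions.
from collections import defaultdict

# Each record: (compound id, target gene symbol, activity as pXC50)
sar_records = [
    ("CHEMBL25", "PTGS2", 4.2),
    ("CHEMBL25", "PTGS1", 3.9),
    ("CHEMBL112", "PTGS2", 5.1),
]

def activities_by_target(records):
    """Group (compound, activity) pairs per protein target."""
    by_target = defaultdict(list)
    for compound, gene, pxc50 in records:
        by_target[gene].append((compound, pxc50))
    return dict(by_target)

grouped = activities_by_target(sar_records)
```

A target-prediction workflow would build one such activity table per protein and train a model on the compound structures behind the identifiers.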
Journal Article
Your Spreadsheets Can Be FAIR: A Tool and FAIRification Workflow for the eNanoMapper Database
2020
The field of nanoinformatics is rapidly developing and provides data-driven solutions in the area of nanomaterials (NM) safety. Safe by Design approaches are encouraged and promoted through regulatory initiatives and multiple scientific projects. Experimental data is at the core of nanoinformatics processing workflows for risk assessment. The nanosafety data is predominantly recorded in Excel spreadsheet files. Although the spreadsheets are quite convenient for the experimentalists, they also pose great challenges for subsequent processing into databases due to the variability of the templates used, the specific details provided by each laboratory and the need for proper metadata documentation and formatting. In this paper, we present a workflow to facilitate the conversion of spreadsheets into a FAIR (Findable, Accessible, Interoperable, and Reusable) database, with the pivotal aid of the NMDataParser tool, developed to streamline the mapping of the original file layout into the eNanoMapper semantic data model. The NMDataParser is an open source Java library and application, making use of a JSON configuration to define the mapping. We describe the JSON configuration syntax and the approaches applied for parsing different spreadsheet layouts used by the nanosafety community. Examples of using the NMDataParser tool in nanoinformatics workflows are given. Challenging cases are discussed and appropriate solutions are proposed.
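The mapping idea described in the abstract can be sketched as follows; the JSON syntax shown is an illustrative analogue of a column-to-field mapping, not the actual NMDataParser configuration format.

```python
import json

# Hypothetical mapping config in the spirit of NMDataParser's JSON
# configuration (the real syntax differs): spreadsheet columns are
# mapped onto fields of a target data model.
config = json.loads("""
{
  "sheet": "results",
  "columns": {
    "A": "material.name",
    "B": "endpoint.value",
    "C": "endpoint.unit"
  }
}
""")

def map_row(row, columns):
    """Apply the column-to-field mapping to one spreadsheet row."""
    record = {}
    for col, field in columns.items():
        record[field] = row.get(col)
    return record

row = {"A": "TiO2 NM-105", "B": "12.5", "C": "mg/L"}
record = map_row(row, config["columns"])
```

Keeping the layout knowledge in a declarative config, rather than in code, is what lets one parser handle the many spreadsheet templates used across laboratories.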
Journal Article
The eNanoMapper database for nanomaterial safety information
by Munteanu, Cristian R; Hardy, Barry; Hegi, Markus
in EU NanoSafety Cluster; Full Research Paper; nanoinformatics
2015
Background: The NanoSafety Cluster, a cluster of projects funded by the European Commission, identified the need for a computational infrastructure for toxicological data management of engineered nanomaterials (ENMs). Ontologies, open standards, and interoperable designs were envisioned to empower a harmonized approach to European research in nanotechnology. This setting provides a number of opportunities and challenges in the representation of nanomaterials data and the integration of ENM information originating from diverse systems. Within this cluster, eNanoMapper works towards supporting the collaborative safety assessment for ENMs by creating a modular and extensible infrastructure for data sharing, data analysis, and building computational toxicology models for ENMs. Results: The eNanoMapper database solution builds on the previous experience of the consortium partners in supporting diverse data through flexible data storage, open source components and web services. We have recently described the design of the eNanoMapper prototype database along with a summary of challenges in the representation of ENM data and an extensive review of existing nano-related data models, databases, and nanomaterials-related entries in chemical and toxicogenomic databases. This paper continues with a focus on the database functionality exposed through its application programming interface (API), and its use in visualisation and modelling. Considering the preferred community practice of using spreadsheet templates, we developed a configurable spreadsheet parser facilitating user friendly data preparation and data upload. We further present a web application able to retrieve the experimental data via the API and analyze it with multiple data preprocessing and machine learning algorithms.
Conclusion: We demonstrate how the eNanoMapper database is used to import and publish online ENM and assay data from several data sources, how the “representational state transfer” (REST) API enables building user friendly interfaces and graphical summaries of the data, and how these resources facilitate the modelling of reproducible quantitative structure–activity relationships for nanomaterials (NanoQSAR).
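A client of such a REST API might look roughly like the sketch below; the base URL, endpoint path, and response shape are placeholders for illustration, not the documented eNanoMapper API.

```python
import json

# Sketch of consuming a REST API like the eNanoMapper database's.
# The base URL and JSON shape are assumptions for illustration.
BASE = "https://db.example.org"

def substance_url(base, substance_id):
    """Build the resource URL for one substance record."""
    return f"{base}/substance/{substance_id}"

# A real client would fetch the URL with urllib or requests; here we
# parse a canned response of the assumed shape.
canned = '{"substance": [{"name": "NM-105", "ownerName": "JRC"}]}'
payload = json.loads(canned)
names = [s["name"] for s in payload["substance"]]
```

The same pattern (stable resource URLs returning structured payloads) is what lets the web application described above retrieve experimental data and feed it to preprocessing and modelling code.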
Journal Article
AMBIT RESTful web services: an implementation of the OpenTox application programming interface
2011
The AMBIT web services package is one of several existing independent implementations of the OpenTox Application Programming Interface and is built according to the principles of the Representational State Transfer (REST) architecture. The Open Source Predictive Toxicology Framework, developed by the partners in the EC FP7 OpenTox project, aims at providing unified access to toxicity data and predictive models, as well as validation procedures. This is achieved by: i) an information model, based on a common OWL-DL ontology; ii) links to related ontologies; iii) data and algorithms, available through a standardized REST web services interface, where every compound, data set or predictive method has a unique web address, used to retrieve its Resource Description Framework (RDF) representation, or initiate the associated calculations.
The AMBIT web services package has been developed as an extension of AMBIT modules, adding the ability to create (Quantitative) Structure-Activity Relationship (QSAR) models and providing an OpenTox API compliant interface. The representation of data and processing resources in the W3C Resource Description Framework facilitates integrating the resources as Linked Data. Uploaded datasets containing chemical structures and an arbitrary set of properties automatically become available online in several formats. The services provide unified interfaces to several descriptor calculation, machine learning and similarity searching algorithms, as well as to applicability domain and toxicity prediction models. All Toxtree modules for predicting the toxicological hazard of chemical compounds are also integrated within this package. The complexity and diversity of the processing is reduced to the simple paradigm "read data from a web address, perform processing, write to a web address". The online service allows users to easily run predictions without installing any software, as well as to share datasets and models online. The downloadable web application allows researchers to set up an arbitrary number of service instances for specific purposes and at suitable locations. These services could be used as a distributed framework for processing of resource-intensive tasks and data sharing, or in a fully independent way, according to the specific needs. The advantage of exposing the functionality via the OpenTox API is seamless interoperability, not only within a single web application, but also in a network of distributed services. Last but not least, the services provide a basis for building web mashups, end user applications with friendly GUIs, as well as embedding the functionalities in existing workflow systems.
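The "every resource is a web address with an RDF representation" paradigm can be sketched as a request built with standard content negotiation; the host below is a placeholder, and whether a given deployment honors exactly this media type is an assumption.

```python
from urllib.request import Request

# Illustrative sketch of the OpenTox/AMBIT resource paradigm: every
# compound or dataset is a web address whose RDF representation is
# requested via the HTTP Accept header. The host is a placeholder.
BASE = "https://ambit.example.org"

def rdf_request(resource_path):
    """Build a request asking for the RDF/XML representation."""
    req = Request(BASE + resource_path)
    req.add_header("Accept", "application/rdf+xml")
    return req

req = rdf_request("/compound/1")
```

Because every compound, dataset, and model is addressable the same way, the same small client works against any OpenTox-compliant service in a distributed network.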
Journal Article
METER.AC: Live Open Access Atmospheric Monitoring Data for Bulgaria with High Spatiotemporal Resolution
by Tenev, Stoyan; Terziyski, Atanas; Jeliazkova, Nina
in atmosphere; Atmospheric pressure; Background radiation
2020
Detailed atmospheric monitoring data are notoriously difficult to obtain for some geographic regions, while they are of paramount importance in scientific research, forecasting, emergency response, policy making, etc. We describe a continuously updated dataset, METER.AC, consisting of raw measurements of atmospheric pressure, temperature, relative humidity, particulate matter, and background radiation in about 100 locations in Bulgaria, as well as some derived values such as sea-level atmospheric pressure, dew/frost point, and hourly trends. The measurements are performed by low-power maintenance-free nodes with common hardware and software, which are specifically designed and optimized for this purpose. The time resolution of the measurements is 5 min. The short-term aim is to deploy at least one node per 100 km², while uniformly covering altitudes between 0 and 3000 m a.s.l. with a special emphasis on remote mountainous areas. A full history of all raw measurements (non-aggregated in time and space) is publicly available, starting from September 2018. We describe the basic technical characteristics of our in-house developed equipment, data organization, and communication protocols as well as present some use case examples. The METER.AC network relies on the paradigm of the Internet of Things (IoT), by collecting data from various gauges. A guiding principle in this work is the provision of findable, accessible, interoperable, and reusable (FAIR) data. The dataset is in the public domain, and it provides resources and tools enabling citizen science development in the context of sustainable development.
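One of the derived values mentioned, sea-level atmospheric pressure, can be computed from the raw station pressure, altitude, and temperature. The sketch below uses the standard barometric reduction formula; whether METER.AC applies exactly this formula is an assumption.

```python
# Sea-level pressure reduction (standard barometric formula, not
# necessarily the exact formula the METER.AC network uses).
def sea_level_pressure(p_station_hpa, altitude_m, temp_c):
    """Reduce station pressure (hPa) to its sea-level equivalent."""
    h = 0.0065 * altitude_m  # temperature lapse over the altitude
    return p_station_hpa * (1 - h / (temp_c + h + 273.15)) ** -5.257

# A station at 100 m reading 1000 hPa at 20 degC reduces to ~1011.7 hPa.
p0 = sea_level_pressure(1000.0, 100.0, 20.0)
```

Publishing the raw (non-reduced) measurements alongside such derived values is what lets users recompute them with a different reduction method if needed.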
Journal Article
Collaborative development of predictive toxicology applications
by Gütlein, Martin; Hardy, Barry; Benigni, Romualdo
in Algorithms; Applications programming; Chemistry
2010
OpenTox provides an interoperable, standards-based Framework for the support of predictive toxicology data management, algorithms, modelling, validation and reporting. It is relevant to satisfying the chemical safety assessment requirements of the REACH legislation as it supports access to experimental data, (Quantitative) Structure-Activity Relationship models, and toxicological information through an integrating platform that adheres to regulatory requirements and OECD validation principles. Initial research defined the essential components of the Framework including the approach to data access, schema and management, use of controlled vocabularies and ontologies, architecture, web service and communications protocols, and selection and integration of algorithms for predictive modelling. OpenTox provides end-user oriented tools to non-computational specialists, risk assessors, and toxicological experts in addition to Application Programming Interfaces (APIs) for developers of new applications. OpenTox actively supports public standards for data representation, interfaces, vocabularies and ontologies, Open Source approaches to core platform components, and community-based collaboration approaches, so as to progress system interoperability goals.
The OpenTox Framework includes APIs and services for compounds, datasets, features, algorithms, models, ontologies, tasks, validation, and reporting which may be combined into multiple applications satisfying a variety of different user needs. OpenTox applications are based on a set of distributed, interoperable OpenTox API-compliant REST web services. The OpenTox approach to ontology allows for efficient mapping of complementary data coming from different datasets into a unifying structure having a shared terminology and representation.
Two initial OpenTox applications are presented as an illustration of the potential impact of OpenTox for high-quality and consistent structure-activity relationship modelling of REACH-relevant endpoints: ToxPredict which predicts and reports on toxicities for endpoints for an input chemical structure, and ToxCreate which builds and validates a predictive toxicity model based on an input toxicology dataset. Because of the extensible nature of the standardised Framework design, barriers of interoperability between applications and content are removed, as the user may combine data, models and validation from multiple sources in a dependable and time-effective way.
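The resource chain behind a ToxCreate-style application might be sketched as the ordered REST calls below; the host, paths, and algorithm name are placeholders, and the real services exchange RDF task and model representations rather than plain strings.

```python
# Hypothetical walk through an OpenTox-style resource chain
# (dataset -> algorithm -> model -> prediction). URIs are placeholders.
def build_workflow(base):
    """Return the ordered REST calls a ToxCreate-style client would make."""
    return [
        ("POST", f"{base}/dataset"),            # upload training data
        ("POST", f"{base}/algorithm/example"),  # train: returns a model URI
        ("POST", f"{base}/model/1"),            # predict for a new compound
        ("GET",  f"{base}/model/1/predicted"),  # fetch predicted features
    ]

calls = build_workflow("https://opentox.example.org")
```

Because each step both consumes and produces addressable resources, the same chain can span services hosted by different providers.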
Journal Article
Towards FAIR nanosafety data
by Gómez-Fernández, Paloma; Ritchie, Peter; Jacobsen, Nicklas Raun
in 639/925/928; 639/925/928/1066; 639/925/928/1071
2021
Nanotechnology is a key enabling technology, with billions of euros in global investment from public funding, including large collaborative projects that have investigated the environmental and health safety aspects of nanomaterials; the reuse of the accumulated data, however, is clearly lagging behind. Here we summarize challenges and provide recommendations for the efficient reuse of nanosafety data, in line with the recently established FAIR (findable, accessible, interoperable and reusable) guiding principles. We describe the FAIR-aligned Nanosafety Data Interface, with aggregated findability, accessibility and interoperability across physicochemical, bio–nano interaction, human toxicity, omics, ecotoxicological and exposure data. Overall, we illustrate a much-needed path towards standards for the optimized use of existing data, which avoids duplication of efforts, and provides a multitude of options to promote safe and sustainable nanotechnology.
The proposal of a FAIR-aligned Nanosafety Data Interface can advance findability, accessibility and interoperability across physicochemical, bio–nano interaction, human toxicity, omics, ecotoxicological and exposure data.
Journal Article
A template wizard for the cocreation of machine-readable data-reporting to harmonize the evaluation of (nano)materials
by Serchi, Tommaso; Apostolova, Margarita D.; Kochev, Nikolay
in 631/114/1314; 631/114/2401; Analytical Chemistry
2024
Making research data findable, accessible, interoperable and reusable (FAIR) is typically hampered by a lack of skills in technical aspects of data management by data generators and a lack of resources. We developed a Template Wizard for researchers to easily create templates suitable for consistently capturing data and metadata from their experiments. The templates are easy to use and enable the compilation of machine-readable metadata to accompany data generation and align them to existing community standards and databases, such as eNanoMapper, streamlining the adoption of the FAIR principles. These templates are citable objects and are available as online tools. The Template Wizard is designed to be user friendly and facilitates using and reusing existing templates for new projects or project extensions. The wizard is accompanied by an online template validator, which allows self-evaluation of the template (to ensure mapping to the data schema and machine readability of the captured data) and transformation by an open-source parser into machine-readable formats, compliant with the FAIR principles. The templates are based on extensive collective experience in nanosafety data collection and include over 60 harmonized data entry templates for physicochemical characterization and hazard assessment (cell viability, genotoxicity, environmental organism dose-response tests, omics), as well as exposure and release studies. The templates are generalizable across fields and have already been extended and adapted for microplastics and advanced materials research. The harmonized templates improve the reliability of interlaboratory comparisons, data reuse and meta-analyses and can facilitate the safety evaluation and regulation process for (nano) materials.
Key points
The wizard facilitates the capture of experimental metadata and data via community-agreed templates, ensuring that data types generated by different instruments are linked (spectrometers, flow cytometers, microscopes/plate readers, etc.). SOPs and experimental workflows are also hyperlinked to the templates, supporting data harmonization and interoperability.
The Template Wizard was evolved with insight and experience gathered over a decade of EU FP7 and H2020 projects addressing nanoscale materials safety.
Community-generated online templates for harmonized data reporting ensure that data and metadata associated with experiments are findable, accessible, interoperable, reusable and compiled for consistency in experimental design and test performance.
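The kind of check the online template validator performs might be sketched as follows; the required field names and rules are illustrative, not the eNanoMapper schema.

```python
# Minimal sketch of a template-record check: required metadata fields
# must be present and the measured value must parse as a number.
# Field names are illustrative assumptions.
REQUIRED = {"material_id", "endpoint", "value", "unit"}

def validate_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED - record.keys())]
    try:
        float(record.get("value", ""))
    except ValueError:
        problems.append("value is not numeric")
    return problems

ok = validate_record({"material_id": "NM-105", "endpoint": "cell viability",
                      "value": "87.5", "unit": "%"})
bad = validate_record({"material_id": "NM-105", "value": "n/a"})
```

Running such checks at data-entry time, rather than during database import, is what keeps the captured data machine-readable from the start.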
Journal Article