83 result(s) for "Data Curation - standards"
Metabolite discovery through global annotation of untargeted metabolomics data
Liquid chromatography–high-resolution mass spectrometry (LC-MS)-based metabolomics aims to identify and quantify all metabolites, but most LC-MS peaks remain unidentified. Here we present a global network optimization approach, NetID, to annotate untargeted LC-MS metabolomics data. The approach aims to generate, for all experimentally observed ion peaks, annotations that match the measured masses, retention times and (when available) tandem mass spectrometry fragmentation patterns. Peaks are connected based on mass differences reflecting adduction, fragmentation, isotopes, or feasible biochemical transformations. Global optimization generates a single network linking most observed ion peaks, enhances peak assignment accuracy, and produces chemically informative peak–peak relationships, including for peaks lacking tandem mass spectrometry spectra. Applying this approach to yeast and mouse data, we identified five previously unrecognized metabolites (thiamine derivatives and N-glucosyl-taurine). Isotope tracer studies indicate active flux through these metabolites. Thus, NetID applies existing metabolomic knowledge and global optimization to substantially improve annotation coverage and accuracy in untargeted metabolomics datasets, facilitating metabolite discovery. The NetID algorithm annotates untargeted LC-MS metabolomics data by combining known biochemical and metabolomic principles with a global network optimization strategy.
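The edge-building step described above can be sketched in a toy form: connect peak pairs whose m/z difference matches a known transformation within a mass tolerance. This is only an illustration of the idea, not the NetID implementation; the transformation table, tolerance, and peak values are all assumptions.

```python
# Toy sketch: connect LC-MS peaks whose mass differences match known
# transformations (isotopes, adducts, biochemical reactions).
# Masses and the transformation table are illustrative, not NetID's.

PPM_TOL = 10e-6  # 10 ppm relative mass tolerance

# Illustrative mass differences in Da for a few edge types
TRANSFORMATIONS = {
    "13C isotope": 1.00336,
    "+Na adduct (vs +H)": 21.98194,
    "dehydration (-H2O)": 18.01056,
}

def connect_peaks(peaks):
    """Return edges (i, j, label) for peak pairs whose m/z difference
    matches a known transformation within tolerance."""
    edges = []
    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            diff = abs(peaks[i] - peaks[j])
            tol = max(peaks[i], peaks[j]) * PPM_TOL
            for label, delta in TRANSFORMATIONS.items():
                if abs(diff - delta) <= tol:
                    edges.append((i, j, label))
    return edges

# Hypothetical m/z values: a parent ion, its 13C isotope peak, and a -H2O fragment
peaks = [180.06339, 181.06675, 162.05283]
print(connect_peaks(peaks))
```

In the full method, these candidate edges feed a global optimization that selects a single consistent annotation network rather than scoring each pair in isolation.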
Credit data generators for data reuse
To promote effective sharing, we must create an enduring link between the people who generate data and its future uses, urge Heather H. Pierce and colleagues.
Open data and digital morphology
Over the past two decades, the development of methods for visualizing and analysing specimens digitally, in three and even four dimensions, has transformed the study of living and fossil organisms. However, the initial promise that the widespread application of such methods would facilitate access to the underlying digital data has not been fully achieved. The underlying datasets for many published studies are not readily or freely available, introducing a barrier to verification and reproducibility, and the reuse of data. There is no current agreement or policy on the amount and type of data that should be made available alongside studies that use, and in some cases are wholly reliant on, digital morphology. Here, we propose a set of recommendations for minimum standards and additional best practice for three-dimensional digital data publication, and review the issues around data storage, management and accessibility.
A framework for the development of a global standardised marine taxon reference image database (SMarTaR-ID) to support image-based analyses
Video and image data are regularly used in the field of benthic ecology to document biodiversity. However, their use is subject to a number of challenges, principally the identification of taxa within the images without associated physical specimens. The challenge of applying traditional taxonomic keys to the identification of fauna from images has led to the development of personal, group, or institution level reference image catalogues of operational taxonomic units (OTUs) or morphospecies. Lack of standardisation among these reference catalogues has led to problems with observer bias and the inability to combine datasets across studies. In addition, lack of a common reference standard is stifling efforts in the application of artificial intelligence to taxon identification. Using the North Atlantic deep sea as a case study, we propose a database structure to facilitate standardisation of morphospecies image catalogues between research groups and support future use in multiple front-end applications. We also propose a framework for coordination of international efforts to develop reference guides for the identification of marine species from images. The proposed structure maps to the Darwin Core standard to allow integration with existing databases. We suggest a management framework where high-level taxonomic groups are curated by a regional team, consisting of both end users and taxonomic experts. We identify a mechanism by which overall quality of data within a common reference guide could be raised over the next decade. Finally, we discuss the role of a common reference standard in advancing marine ecology and supporting sustainable use of this ecosystem.
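Since the proposed structure maps to the Darwin Core standard, a catalogue entry can be pictured as an internal OTU record projected onto Darwin Core term names. The mapping below is a hedged sketch: the term names are genuine Darwin Core terms, but the helper, field values, and internal record shape are illustrative, not the SMarTaR-ID schema itself.

```python
# Illustrative sketch: a morphospecies (OTU) catalogue entry mapped onto
# Darwin Core terms. Internal field names and values are hypothetical.

def to_darwin_core(otu):
    """Map an internal OTU record to Darwin Core term names."""
    return {
        "scientificName": otu["closest_taxon"],
        "taxonRank": otu["rank"],
        "identificationQualifier": otu["qualifier"],  # e.g. 'msp. 1'
        "identifiedBy": otu["curator"],
        "associatedMedia": otu["reference_image"],
    }

otu = {
    "closest_taxon": "Acanella",
    "rank": "genus",
    "qualifier": "msp. 1",
    "curator": "Regional curation team",
    "reference_image": "https://example.org/images/acanella_msp1.jpg",
}
record = to_darwin_core(otu)
print(record["scientificName"], record["identificationQualifier"])
```

Using standard term names like these is what allows separately curated regional catalogues to be combined and integrated with existing biodiversity databases.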
Activity, assay and target data curation and quality in the ChEMBL database
The emergence of a number of publicly available bioactivity databases, such as ChEMBL, PubChem BioAssay and BindingDB, has raised awareness about the topics of data curation, quality and integrity. Here we provide an overview and discussion of the current and future approaches to activity, assay and target data curation of the ChEMBL database. This curation process involves several manual and automated steps and aims to: (1) maximise data accessibility and comparability; (2) improve data integrity and flag outliers, ambiguities and potential errors; and (3) add further curated annotations and mappings thus increasing the usefulness and accuracy of the ChEMBL data for all users and modellers in particular. Issues related to activity, assay and target data curation and integrity along with their potential impact for users of the data are discussed, alongside robust selection and filter strategies in order to avoid or minimise these, depending on the desired application.
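One automated step of the kind described, flagging outliers among replicate activity measurements, can be sketched as follows. The threshold, field layout, and example identifiers are illustrative assumptions, not ChEMBL's actual curation pipeline.

```python
# Hedged sketch of an automated curation step: flag (compound, target)
# pairs whose replicate activity values disagree widely. The 2-log-unit
# threshold and the example records are illustrative only.
from collections import defaultdict
from statistics import median

def flag_outliers(activities, max_log_spread=2.0):
    """Group pIC50-style values by (compound, target) and flag groups
    whose spread exceeds max_log_spread log units."""
    groups = defaultdict(list)
    for compound, target, value in activities:
        groups[(compound, target)].append(value)
    flags = {}
    for key, values in groups.items():
        spread = max(values) - min(values)
        flags[key] = {"median": median(values),
                      "flagged": spread > max_log_spread}
    return flags

data = [
    ("CHEMBL25", "CHEMBL204", 6.1),
    ("CHEMBL25", "CHEMBL204", 6.3),
    ("CHEMBL25", "CHEMBL204", 9.0),  # suspicious replicate
]
print(flag_outliers(data))
```

Flagging rather than deleting such records matches the stated aim: outliers, ambiguities, and potential errors are surfaced for users to filter according to their application.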
A data citation roadmap for scientific publishers
This article presents a practical roadmap for scholarly publishers to implement data citation in accordance with the Joint Declaration of Data Citation Principles (JDDCP), a synopsis and harmonization of the recommendations of major science policy bodies. It was developed by the Publishers Early Adopters Expert Group as part of the Data Citation Implementation Pilot (DCIP) project, an initiative of FORCE11.org and the NIH BioCADDIE program. The structure of the roadmap presented here follows the "life of a paper" workflow and includes the categories Pre-submission, Submission, Production, and Publication. The roadmap is intended to be publisher-agnostic so that all publishers can use this as a starting point when implementing JDDCP-compliant data citation. Authors reading this roadmap will also better know what to expect from publishers and how to enable their own data citations to gain maximum impact, as well as complying with what will become increasingly common funder mandates on data transparency.
Machine learning tools match physician accuracy in multilingual text annotation
In the medical field, text annotation involves categorizing clinical and biomedical texts with specific medical categories, enhancing the organization and interpretation of large volumes of unstructured data. This process is crucial for developing tools such as speech recognition systems, which help medical professionals reduce their paperwork. It addresses a significant cause of burnout reported by up to 60% of medical staff. However, annotating medical texts in languages other than English poses unique challenges and necessitates using advanced models. In our research, conducted in collaboration with Gdańsk University of Technology and the Medical University of Gdańsk, we explore strategies to tackle these challenges. We evaluated the performance of various tools and models in recognizing medical terms within a comprehensive vocabulary, comparing these tools’ outcomes with annotations made by medical experts. Our study specifically examined categories such as ‘Drugs’, ‘Diseases and Symptoms’, ‘Procedures’, and ‘Other Medical Terms’, contrasting human expert annotations with the performance of popular multilingual chatbots and natural language processing (NLP) tools on translated texts. The conclusion drawn from our statistical analysis reveals that no significant differences were detected between the groups we examined. This suggests that the tools and models we tested are, on average, similarly effective—or ineffective—at recognizing medical terms as categorized by our specific criteria. Our findings highlight the challenges in bridging the gap between human and machine accuracy in medical text annotation, especially in non-English contexts, and emphasize the need for further refinement of these technologies.
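The core comparison described, expert annotations versus tool output over the same terms, amounts to computing per-category agreement. The sketch below is a minimal illustration: the category names come from the study, but the metric, tokens, and labels are invented for demonstration.

```python
# Minimal sketch: per-category agreement between expert annotations and
# tool output over the same tokens. Data and metric are illustrative.

def per_category_agreement(expert, tool, categories):
    """For each category, compute the fraction of relevant tokens on
    which the expert and the tool assign the same label."""
    scores = {}
    for cat in categories:
        relevant = [t for t in expert
                    if expert[t] == cat or tool.get(t) == cat]
        if not relevant:
            scores[cat] = None
            continue
        agree = sum(1 for t in relevant if expert[t] == tool.get(t))
        scores[cat] = agree / len(relevant)
    return scores

expert = {"ibuprofen": "Drugs",
          "fever": "Diseases and Symptoms",
          "biopsy": "Procedures"}
tool = {"ibuprofen": "Drugs",
        "fever": "Drugs",  # tool mislabels this token
        "biopsy": "Procedures"}
print(per_category_agreement(
    expert, tool, ["Drugs", "Diseases and Symptoms", "Procedures"]))
```

In practice such scores would then be compared across tools with a statistical test, which is how the study concluded that no significant differences were detected between groups.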
Somatic cancer variant curation and harmonization through consensus minimum variant level data
Background: To truly achieve personalized medicine in oncology, it is critical to catalog and curate cancer sequence variants for their clinical relevance. The Somatic Working Group (WG) of the Clinical Genome Resource (ClinGen), in cooperation with ClinVar and multiple cancer variant curation stakeholders, has developed a consensus set of minimal variant level data (MVLD). MVLD is a framework of standardized data elements to curate cancer variants for clinical utility. With implementation of MVLD standards, and in a working partnership with ClinVar, we aim to streamline the somatic variant curation efforts in the community and reduce redundancy and time burden for the interpretation of cancer variants in clinical practice.
Methods: We developed MVLD through a consensus approach by i) reviewing clinical actionability interpretations from institutions participating in the WG, ii) conducting an extensive literature search of clinical somatic interpretation schemas, and iii) surveying cancer variant web portals. A forthcoming guideline on cancer variant interpretation, from the Association for Molecular Pathology (AMP), can be incorporated into MVLD.
Results: Along with harmonizing standardized terminology for allele interpretive and descriptive fields that are collected by many databases, the MVLD includes unique fields for cancer variants such as Biomarker Class, Therapeutic Context and Effect. In addition, MVLD includes recommendations for controlled semantics and ontologies. The Somatic WG is collaborating with ClinVar to evaluate MVLD use for somatic variant submissions. ClinVar is an open and centralized repository where sequencing laboratories can report summary-level variant data with clinical significance, and ClinVar accepts cancer variant data.
Conclusions: We expect the use of the MVLD to streamline clinical interpretation of cancer variants, enhance interoperability among multiple redundant curation efforts, and increase submission of somatic variants to ClinVar, all of which will enhance translation to clinical oncology practice.
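A record following the MVLD idea can be pictured as a small set of standardized fields, including the cancer-specific elements named above (Biomarker Class, Therapeutic Context, Effect). This is a hedged sketch: the three cancer-specific fields come from the abstract, but the record shape and remaining field names are illustrative, not the full MVLD element list.

```python
# Hedged sketch of a somatic variant record in the spirit of MVLD.
# Only biomarker_class, therapeutic_context, and effect are named in the
# source; everything else here is illustrative.
from dataclasses import dataclass, asdict

@dataclass
class MinimalVariantRecord:
    gene: str
    variant: str              # allele-level description, e.g. protein change
    biomarker_class: str      # e.g. predictive, prognostic, diagnostic
    therapeutic_context: str  # therapy the interpretation applies to
    effect: str               # e.g. responsive, resistant

rec = MinimalVariantRecord(
    gene="BRAF",
    variant="p.V600E",
    biomarker_class="predictive",
    therapeutic_context="vemurafenib",
    effect="responsive",
)
print(asdict(rec))
```

Standardizing fields like these is what lets curation efforts at different institutions exchange and merge interpretations, and submit them to a central repository such as ClinVar.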
Standardization of electroencephalography for multi-site, multi-platform and multi-investigator studies: insights from the Canadian Biomarker Integration Network in Depression
Subsequent to global initiatives in mapping the human brain and investigations of neurobiological markers for brain disorders, the number of multi-site studies involving the collection and sharing of large volumes of brain data, including electroencephalography (EEG), has been increasing. Among the complexities of conducting multi-site studies and increasing the shelf life of biological data beyond the original study are timely standardization and documentation of relevant study parameters. We present the insights gained and guidelines established within the EEG working group of the Canadian Biomarker Integration Network in Depression (CAN-BIND). CAN-BIND is a multi-site, multi-investigator, and multi-project network supported by the Ontario Brain Institute with access to Brain-CODE, an informatics platform that hosts a multitude of biological data across a growing list of brain pathologies. We describe our approaches and insights on documenting and standardizing parameters across the study design, data collection, monitoring, analysis, integration, knowledge-translation, and data archiving phases of CAN-BIND projects. We introduce a custom-built EEG toolbox to track data preprocessing with open-access for the scientific community. We also evaluate the impact of variation in equipment setup on the accuracy of acquired data. Collectively, this work is intended to inspire establishing comprehensive and standardized guidelines for multi-site studies.
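The documentation idea behind tracking data preprocessing can be sketched as a per-recording log that records every step with its parameters, so that data from different sites remain comparable and auditable. This is an illustration of the concept only; the step names, parameters, and log format are assumptions, not the CAN-BIND EEG toolbox.

```python
# Illustrative sketch: log each EEG preprocessing step with its parameters
# so multi-site pipelines stay documented and reproducible. Step names and
# parameters are hypothetical.
import json

class PreprocessingLog:
    def __init__(self, site, subject):
        self.header = {"site": site, "subject": subject}
        self.steps = []

    def record(self, step, **params):
        """Append one preprocessing step and its parameter settings."""
        self.steps.append({"step": step, "params": params})

    def to_json(self):
        """Serialize the full provenance record for archiving."""
        return json.dumps({"header": self.header, "steps": self.steps},
                          indent=2)

log = PreprocessingLog(site="SiteA", subject="S001")
log.record("bandpass_filter", low_hz=0.5, high_hz=55.0)
log.record("rereference", scheme="average")
log.record("ica_artifact_removal", components_removed=3)
print(log.to_json())
```

Archiving such a record alongside the data addresses the shelf-life concern raised above: future users can see exactly how each site's recordings were processed long after the original study ends.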