Catalogue Search | MBRL
1,586 result(s) for "Data curation"
Mastering the data paradox : the key to winning in the AI age
2024
There are two remarkable phenomena that are unfolding almost simultaneously. The first is the emergence of a data-first world, where data has become a central driving force, shaping industries and fueling innovation. The second is the dawn of the AI age, propelled by the advent of Generative AI, that has created the possibility to leverage the data of the world for the first time. The convergence of these two, with data as the common denominator, holds immense promise and the opportunities are boundless. This book provides us with opportunities to push our thinking, to innovate, to transform and to create a better future at all levels--individual, enterprise and the world.
Metabolite discovery through global annotation of untargeted metabolomics data
2021
Liquid chromatography–high-resolution mass spectrometry (LC-MS)-based metabolomics aims to identify and quantify all metabolites, but most LC-MS peaks remain unidentified. Here we present a global network optimization approach, NetID, to annotate untargeted LC-MS metabolomics data. The approach aims to generate, for all experimentally observed ion peaks, annotations that match the measured masses, retention times and (when available) tandem mass spectrometry fragmentation patterns. Peaks are connected based on mass differences reflecting adduction, fragmentation, isotopes, or feasible biochemical transformations. Global optimization generates a single network linking most observed ion peaks, enhances peak assignment accuracy, and produces chemically informative peak–peak relationships, including for peaks lacking tandem mass spectrometry spectra. Applying this approach to yeast and mouse data, we identified five previously unrecognized metabolites (thiamine derivatives and N-glucosyl-taurine). Isotope tracer studies indicate active flux through these metabolites. Thus, NetID applies existing metabolomic knowledge and global optimization to substantially improve annotation coverage and accuracy in untargeted metabolomics datasets, facilitating metabolite discovery. The NetID algorithm annotates untargeted LC-MS metabolomics data by combining known biochemical and metabolomic principles with a global network optimization strategy.
Journal Article
Credit data generators for data reuse
by Statham, Emily; Pierce, Heather H.; Bierer, Barbara E.
in 706/648/453, 706/648/479, 706/648/496
2019
To promote effective sharing, we must create an enduring link between the people who generate data and its future uses, urge Heather H. Pierce and colleagues.
Fluorescence immunohistochemistry and confocal microscopy images of normal and cancerous human tissue samples
Journal Article
3D Data Creation to Curation
2022
3D Data Creation to Curation: Community Standards for 3D Data Preservation collects the efforts of the Community Standards for 3D Data Preservation (CS3DP) initiative--a large practicing community of librarians, researchers, engineers, and designers--to move toward establishment of shared guidelines, practices, and standards. Using a collaborative approach for standards development that promotes individual investment and broad adoption, this group has produced a work that captures the shared preservation needs of the whole community.
Vaccine-related advertising in the Facebook Ad Archive
2020
• First assessment of vaccine-related advertisements on Facebook Ad Archive.
• Top pro-vaccine ad themes: vaccine promotion, philanthropy, news.
• Top anti-vaccine ad themes: vaccine harm, promoting choice, uncovering “fraud”.
• Two buyers accounted for majority (54%) of anti-vaccine advertising content.
• Facebook policies negatively impact first time ad buyers, largely pro-vaccine.
In 2018, Facebook introduced Ad Archive as a platform to improve transparency in advertisements related to politics and “issues of national importance.” Vaccine-related Facebook advertising is publicly available for the first time. After measles outbreaks in the US brought renewed attention to the possible role of Facebook advertising in the spread of vaccine-related misinformation, Facebook announced steps to limit vaccine-related misinformation. This study serves as a baseline of advertising before new policies went into effect.
Using the keyword ‘vaccine’, we searched Ad Archive on December 13, 2018 and again on February 22, 2019. We exported data for 505 advertisements. A team of annotators sorted advertisements by content: pro-vaccine, anti-vaccine, not relevant. We also conducted a thematic analysis of major advertising themes. We ran Mann-Whitney U tests to compare ad performance metrics.
309 advertisements were included in analysis with 163 (53%) pro-vaccine advertisements and 145 (47%) anti-vaccine advertisements. Despite a similar number of advertisements, the median number of ads per buyer was significantly higher for anti-vaccine ads. First time buyers are less likely to complete disclosure information and risk ad removal. Thematically, anti-vaccine advertising messages are relatively uniform and emphasize vaccine harms (55%). In contrast, pro-vaccine advertisements come from a diverse set of buyers (83 unique) with varied goals including promoting vaccination (49%), vaccine related philanthropy (15%), and vaccine related policy (14%).
A small set of anti-vaccine advertisement buyers have leveraged Facebook advertisements to reach targeted audiences. By deeming all vaccine-related content an issue of “national importance,” Facebook has further politicized vaccines. The implementation of a blanket disclosure policy also limits which ads can successfully run on Facebook. Improving transparency and limiting misinformation should not be separate goals. Public health communication efforts should consider the potential impact on Facebook users’ vaccine attitudes and behaviors.
Journal Article
A framework for the development of a global standardised marine taxon reference image database (SMarTaR-ID) to support image-based analyses
by Taranto, Gerald H.; Morato, Telmo; Jones, Daniel O. B.
in Animals, Artificial Intelligence, Biodiversity
2019
Video and image data are regularly used in the field of benthic ecology to document biodiversity. However, their use is subject to a number of challenges, principally the identification of taxa within the images without associated physical specimens. The challenge of applying traditional taxonomic keys to the identification of fauna from images has led to the development of personal, group, or institution level reference image catalogues of operational taxonomic units (OTUs) or morphospecies. Lack of standardisation among these reference catalogues has led to problems with observer bias and the inability to combine datasets across studies. In addition, lack of a common reference standard is stifling efforts in the application of artificial intelligence to taxon identification. Using the North Atlantic deep sea as a case study, we propose a database structure to facilitate standardisation of morphospecies image catalogues between research groups and support future use in multiple front-end applications. We also propose a framework for coordination of international efforts to develop reference guides for the identification of marine species from images. The proposed structure maps to the Darwin Core standard to allow integration with existing databases. We suggest a management framework where high-level taxonomic groups are curated by a regional team, consisting of both end users and taxonomic experts. We identify a mechanism by which overall quality of data within a common reference guide could be raised over the next decade. Finally, we discuss the role of a common reference standard in advancing marine ecology and supporting sustainable use of this ecosystem.
Journal Article
Activity, assay and target data curation and quality in the ChEMBL database
by Hersey, Anne; Papadatos, George; Overington, John P.
in Accessibility, Ambiguity, Animal Anatomy
2015
The emergence of a number of publicly available bioactivity databases, such as ChEMBL, PubChem BioAssay and BindingDB, has raised awareness about the topics of data curation, quality and integrity. Here we provide an overview and discussion of the current and future approaches to activity, assay and target data curation of the ChEMBL database. This curation process involves several manual and automated steps and aims to: (1) maximise data accessibility and comparability; (2) improve data integrity and flag outliers, ambiguities and potential errors; and (3) add further curated annotations and mappings thus increasing the usefulness and accuracy of the ChEMBL data for all users and modellers in particular. Issues related to activity, assay and target data curation and integrity along with their potential impact for users of the data are discussed, alongside robust selection and filter strategies in order to avoid or minimise these, depending on the desired application.
Journal Article
Developing a modern data workflow for regularly updated data
by White, Ethan P.; Ernest, S. K. Morgan; Christensen, Erica M.
in Animals, Archives & records, Archiving
2019
Over the past decade, biology has undergone a data revolution in how researchers collect data and the amount of data being collected. An emerging challenge that has received limited attention in biology is managing, working with, and providing access to data under continual active collection. Regularly updated data present unique challenges in quality assurance and control, data publication, archiving, and reproducibility. We developed a workflow for a long-term ecological study that addresses many of the challenges associated with managing this type of data. We do this by leveraging existing tools to 1) perform quality assurance and control; 2) import, restructure, version, and archive data; 3) rapidly publish new data in ways that ensure appropriate credit to all contributors; and 4) automate most steps in the data pipeline to reduce the time and effort required by researchers. The workflow leverages tools from software development, including version control and continuous integration, to create a modern data management system that automates the pipeline.
Journal Article
Pneumothorax detection in chest radiographs: optimizing artificial intelligence system for accuracy and confounding bias reduction using in-image annotations in algorithm training
by Ingrisch, Michael; Sabel, Bastian O.; Fieselmann, Andreas
in Algorithms, Annotations, Artificial Intelligence
2021
Objectives
Diagnostic accuracy of artificial intelligence (AI) pneumothorax (PTX) detection in chest radiographs (CXR) is limited by the noisy annotation quality of public training data and confounding thoracic tubes (TT). We hypothesize that in-image annotations of the dehiscent visceral pleura for algorithm training boosts algorithm’s performance and suppresses confounders.
Methods
Our single-center evaluation cohort of 3062 supine CXRs includes 760 PTX-positive cases with radiological annotations of PTX size and inserted TTs. Three step-by-step improved algorithms (differing in algorithm architecture, training data from public datasets/clinical sites, and in-image annotations included in algorithm training) were characterized by area under the receiver operating characteristics (AUROC) in detailed subgroup analyses and referenced to the well-established “CheXNet” algorithm.
Results
Performances of established algorithms exclusively trained on publicly available data without in-image annotations are limited to AUROCs of 0.778 and strongly biased towards TTs that can completely eliminate algorithm’s discriminative power in individual subgroups. Contrarily, our final “algorithm 2” which was trained on a lower number of images but additionally with in-image annotations of the dehiscent pleura achieved an overall AUROC of 0.877 for unilateral PTX detection with a significantly reduced TT-related confounding bias.
Conclusions
We demonstrated strong limitations of an established PTX-detecting AI algorithm that can be significantly reduced by designing an AI system capable of learning to both classify and localize PTX. Our results are aimed at drawing attention to the necessity of high-quality in-image localization in training data to reduce the risks of unintentionally biasing the training process of pathology-detecting AI algorithms.
Key Points
• Established pneumothorax-detecting artificial intelligence algorithms trained on public training data are strongly limited and biased by confounding thoracic tubes.
• We used high-quality in-image annotated training data to effectively boost algorithm performance and suppress the impact of confounding thoracic tubes.
• Based on our results, we hypothesize that even hidden confounders might be effectively addressed by in-image annotations of pathology-related image features.
Journal Article