Catalogue Search | MBRL

Towards global data products of Essential Biodiversity Variables on species traits

by Jones, Owen R.; Agosti, Donat; Bowser, Anne in 631/158, 631/158/670, 704/158

2018

Essential Biodiversity Variables (EBVs) allow observation and reporting of global biodiversity change, but a detailed framework for the empirical derivation of specific EBVs has yet to be developed. Here, we re-examine and refine the previous candidate set of species traits EBVs and show how traits related to phenology, morphology, reproduction, physiology and movement can contribute to EBV operationalization. The selected EBVs express intra-specific trait variation and allow monitoring of how organisms respond to global change. We evaluate the societal relevance of species traits EBVs for policy targets and demonstrate how open, interoperable and machine-readable trait data enable the building of EBV data products. We outline collection methods, (meta)data standardization, reproducible workflows, semantic tools and licence requirements for producing species traits EBVs. An operationalization is critical for assessing progress towards biodiversity conservation and sustainable development goals and has wide implications for data-intensive science in ecology, biogeography, conservation and Earth observation. Essential Biodiversity Variables (EBVs) are intended to provide standardized measurements for reporting biodiversity change. Here, the authors outline the conceptual and empirical basis for the use of EBVs based on species traits, and highlight tools necessary for creating comprehensive EBV data products.

Journal Article

Taxonomic Treatments as Open FAIR Digital Objects

by Agosti, Donat; Ioannidis-Pantopikos, Alexandros in biocenosis, Biodiversity, Bioinformatics

2022

Taxonomy is the science of charting and describing the world's biodiversity. Organisms are grouped into taxa, which are assigned a rank, building the taxonomic hierarchy. The taxa are described in taxonomic treatments, well-defined sections of scientific publications (Catapano 2019). They include a nomenclatural section and one or more sections including descriptions, material citations referring to studied specimens, or notes on ecology and behavior. In case the treatment does not describe a newly discovered taxon, previous treatments are cited in the form of treatment citations. This citation can refer to a previous treatment and add additional data, or it can be a statement synonymizing the taxon with another taxon. This allows building a citation network, and ultimately is a constituent part of the catalogue of life. Thus treatments play an important role in understanding the diversity of life on Earth by providing the scientific argument why a group of organisms is a new species, or a synonym, and the data provided will increasingly be important to analyze and compare whole genomes of individual genomes. Treatments have been extracted by Plazi since 2008 (Agosti and Egloff 2009), and the TaxPub schema has been described by Catapano (Catapano 2019) to complement existing vocabularies to allow annotation of legacy literature and to produce new publications including the respective annotations (Penev et al. 2010). Today, more than 750,000 treatments have been annotated by Plazi's TreatmentBank and over 400,000 have been made FAIR digital objects in the Biodiversity Literature Repository in a collaboration of Plazi, Zenodo and Pensoft (Ioannidis-Pantopikos and Agosti 2021, Agosti et al. 2019), and are reused by the Global Biodiversity Information Facility (GBIF), Global Biotic Interaction (GloBI), and the Library System of the Swiss Institute of Bioinformatics (SIBiLS). Each treatment on the Zenodo repository is findable through its rich metadata. The insertion of custom metadata in Zenodo provides metadata referring to domain-specific vocabularies such as Darwin Core (Ioannidis-Pantopikos and Agosti 2021). The treatment is accessible through its DataCite Digital Object Identifier (DOI) for the taxonomic treatment as a subtype of a publication. The data is interoperable through a machine-actionable JSON version of the treatment. A license is provided to ensure it is reusable. The richness of data and citations within a treatment provides a stepping stone to add treatments not only to knowledge systems such as Wikidata or openBioDiv, but also to provide links to many of the cited objects, such as specimens through the material citations, and thus a well-curated assemblage of links. Being a FAIR digital object, a treatment can be cited and should ultimately be linked to from a taxonomic name used in an identification of an organism.

Journal Article

Shuttleworth Fellowship Application 2016

by Agosti, Donat in applications, biodive, fellowship

2016

This is the application by Donat Agosti for a Shuttleworth Fellowship with the goal of making scientific data publication an integral part of an open knowledge management system. A successful application will help move TreatmentBank from its current prototype state into a production system that converts and extracts data from taxonomic publications and makes them available as Linked Open Data. The project is about developing the infrastructure and creating a corpus of data, while at the same time serving as a showcase of open access. This is an exemplar approach that hopefully will be adopted by other scientific domains.

Journal Article

A global database of ant species abundances

by Jenkins, Clinton; Majer, Jonathan; Suarez, Andrew in abundance, Animals, Ants

2017

What forces structure ecological assemblages? A key limitation to general insights about assemblage structure is the availability of data that are collected at a small spatial grain (local assemblages) and a large spatial extent (global coverage). Here, we present published and unpublished data from 51,388 ant abundance and occurrence records of more than 2,693 species and 7,953 morphospecies from local assemblages collected at 4,212 locations around the world. Ants were selected because they are diverse and abundant globally, comprise a large fraction of animal biomass in most terrestrial communities, and are key contributors to a range of ecosystem functions. Data were collected between 1949 and 2014, and include, for each geo-referenced sampling site, both the identity of the ants collected and details of sampling design, habitat type, and degree of disturbance. The aim of compiling this data set was to provide comprehensive species abundance data in order to test relationships between assemblage structure and environmental and biogeographic factors. Data were collected using a variety of standardized methods, such as pitfall and Winkler traps, and will be valuable for studies investigating large-scale forces structuring local assemblages. Understanding such relationships is particularly critical under current rates of global change. We encourage authors holding additional data on systematically collected ant assemblages, especially those in dry and cold, and remote areas, to contact us and contribute their data to this growing data set.

Journal Article

From literature to biodiversity data: mining arthropod organismal traits with machine learning

by Agosti, Donat; Cornelius, Joseph; Waterhouse, Robert in Arthropoda, Arthropods, Biodiversity

2025

The fields of taxonomy and biodiversity research have witnessed an exponential growth in published literature. This vast corpus of articles holds information on the diverse biological traits of organisms and their ecologies. However, access to and extraction of relevant data from this extensive resource remain challenging. Advances in text and data mining (TDM) and Natural Language Processing (NLP) techniques offer new opportunities for liberating such information from literature. Testing and using such approaches to annotate articles in machine-actionable formats is, therefore, necessary to enable the exploitation of existing knowledge in new biology, ecology and evolution research. Here, we explore the potential of these methods to annotate and extract organismal trait data for the most diverse animal group on Earth, the arthropods. The article processing workflow uses manually curated trait dictionaries with trained NLP models to perform labelling of entities and relationships of thousands of articles. A subset of manually annotated documents facilitated the formal evaluation of the performance of the workflow in terms of entity recognition and normalisation and relationship extraction, highlighting several important technical challenges. The results are made available to the scientific community through an interactive web tool and queryable resource, the ArTraDB Arthropod Trait Database. These methodological explorations provide a framework that could be extended beyond the arthropods, where TDM and NLP approaches applied to the taxonomy and biodiversity literature will greatly facilitate data synthesis studies and literature reviews, the identification of knowledge gaps and biases, as well as the data-informed investigation of ecological and evolutionary trends and patterns.

Journal Article

OpenBiodiv-O: ontology of the OpenBiodiv knowledge management system

by Senderov, Viktor; Agosti, Donat; Simov, Kiril in Algorithms, Analysis, Biodiversity

2018

Background: The biodiversity domain, and in particular biological taxonomy, is moving in the direction of semantization of its research outputs. The present work introduces OpenBiodiv-O, the ontology that serves as the basis of the OpenBiodiv Knowledge Management System. Our intent is to provide an ontology that fills the gaps between ontologies for biodiversity resources, such as DarwinCore-based ontologies, and semantic publishing ontologies, such as the SPAR Ontologies. We bridge this gap by providing an ontology focusing on biological taxonomy. Results: OpenBiodiv-O introduces classes, properties, and axioms in the domains of scholarly biodiversity publishing and biological taxonomy and aligns them with several important domain ontologies (FaBiO, DoCO, DwC, Darwin-SW, NOMEN, ENVO). By doing so, it bridges the ontological gap across scholarly biodiversity publishing and biological taxonomy and allows for the creation of a Linked Open Dataset (LOD) of biodiversity information (a biodiversity knowledge graph) and enables the creation of the OpenBiodiv Knowledge Management System. A key feature of the ontology is that it is an ontology of the scientific process of biological taxonomy and not of any particular state of knowledge. This feature allows it to express a multiplicity of scientific opinions. The resulting OpenBiodiv knowledge system may gain a high level of trust in the scientific community as it does not force a scientific opinion on its users (e.g. practicing taxonomists, library researchers, etc.), but rather provides the tools for experts to encode different views as science progresses. Conclusions: OpenBiodiv-O provides a conceptual model of the structure of a biodiversity publication and the development of related taxonomic concepts. It also serves as the basis for the OpenBiodiv Knowledge Management System.

Journal Article

Liberate the power of biodiversity literature as FAIR digital objects

by Agosti, Donat; Bénichou, Laurence; Casino, Ana in artificial intelligence, biodiversity, data retrieval

2024

Knowledge about biodiversity is largely embedded in a daily growing corpus of over 500 million pages of biodiversity literature that is not machine-actionable. It is thus not open to building a biodiversity knowledge graph, or facilitating the use of artificial intelligence tools. This hinders the completion of a much-needed taxonomic name reference system, prevents the discovery of the biotic interactions underpinning the prediction and understanding of global change trends and consequences, viral spillovers, annotation of genes with their respective phenotypes, and their citations in various domains dealing with biological species such as conservation, agriculture, medicine, life sciences and industry, necessary to achieve the objectives of the Green Deal and address the targets identified in the Global Biodiversity Framework. This Policy Brief highlights key actions that can liberate the scientific data published, exploit their use, promote an enhanced way to publish, and ultimately foster excellence and innovation in biodiversity science, monitoring and conservation.

Journal Article

Taxonomic information exchange and copyright: the Plazi approach

by Egloff, Willi; Agosti, Donat in Biology, Biomedical and Life Sciences, Biomedicine

2009

Background: A large part of our knowledge on the world's species is recorded in the corpus of biodiversity literature with well over a hundred million pages, and is represented in natural history collections estimated at 2–3 billion specimens. But this body of knowledge is almost entirely in paper-print form and is not directly accessible through the Internet. For the digitization of this literature, new territories have to be charted in the fields of technical, legal and social issues that presently impede its advance. The taxonomic literature seems especially destined for such a transformation. Discussion: Plazi was founded as an association with the primary goal of transforming both the printed and, more recently, "born-digital" taxonomic literature into semantically enabled, enhanced documents. This includes the creation of a test body of literature, an XML schema modeling its logic content (TaxonX), the development of a mark-up editor (GoldenGATE) allowing also the enhancement of documents with links to external resources via Life Science Identifiers (LSID), a repository for publications and issuance of bibliographic identifiers, a dedicated server to serve the marked up content (the Plazi Search and Retrieval Server, SRS) and semantic tools to mine information. Plazi's workflow is designed to respect copyright protection and achieves extraction by observing exceptions and limitations existent in international copyright law. Conclusion: The information found in Plazi's databases – taxonomic treatments as well as the metadata of the publications – is in the public domain and can therefore be used for further scientific research without any restriction, whether or not contained in copyrighted publications.

Journal Article

Joint statement by CETAF, SPNHC and BHL on DATA within scientific publications: clarification of noncopyrightability

by Agosti, Donat; Rinaldo, Constance; Buschbom, Jutta in Academic publications, Automation, Biodiversity

2023

The EU and other states have made legislative efforts to clarify data mining in copyrightable works, but the situation remains obscure and confusing, especially in a globalised field where international legislation can contribute to opacity. The present paper aims to assert a common position of three communities representing biodiversity sciences and data specialists on this issue and to propose common and best practice guidelines so that they become universally accepted rules. As scientific data users, we take the standpoint that scientific data are not copyrightable and, furthermore, that they can be accessed, shared and reused freely. Thus, once legal access has been gained to copyrighted publications, the data within those scholarly publications can be considered to be open data that is freely extractable. This set of recommendations has been reached specifically for scientific use and societal benefits.

Journal Article

A Formicine in New Jersey Cretaceous Amber (Hymenoptera: Formicidae) and Early Evolution of the Ants

by Donat Agosti, David Grimaldi in Amber, anatomy and morphology, Animals

2000

A worker ant preserved with microscopic detail has been discovered in Turonian-aged New Jersey amber [ca. 92 mega-annum (Ma)]. The apex of the gaster has an acidopore and, thus, allows definitive assignment of the fossil to the large extant subfamily Formicinae, members of which use a defensive spray of formic acid. This specimen is the only Cretaceous record of the subfamily, and only two other fossil ants are known from the Cretaceous that unequivocally belong to an extant subfamily (Brownimecia and Canapone of the Ponerinae, in New Jersey and Canadian amber, respectively). In lieu of a cladogram of formicine genera, generalized morphology of this fossil suggests a basal position in the subfamily. Formicinae and Ponerinae in the mid Cretaceous indicate divergence of basal lineages of ants near the Albian (ca. 105-110 Ma) when they presumably diverged from the Sphecomyrminae. Sphecomyrmines are the plesiomorphic sister group to all other ants, or they are a paraphyletic stem group ancestral to all other ants; they apparently became extinct in the Late Cretaceous. Ant abundance in major deposits of Cretaceous and Tertiary insects indicates that they did not become common and presumably dominant in terrestrial ecosystems until the Eocene (ca. 45 Ma). It is at this time that modern genera that form very large colonies (at least 10,000 individuals) first appear. During the Cretaceous, eusocial termites, bees, and vespid wasps also first appear; they show a similar pattern of diversification and proliferation in the Tertiary. The Cretaceous ants have further implications for interpreting distributions of modern ants.

Journal Article
