Catalogue Search | MBRL
665 result(s) for "taxonomic databases"
A Global Assessment of Distribution, Diversity, Endemism, and Taxonomic Effort in the Rubiaceae
by Davis, Aaron P.; Govaerts, Rafaël; Moat, Justin
in Biological taxonomies, Botanical gardens, Endemic species
2009
Analyses of distribution, diversity, endemism, and taxonomic effort for Rubiaceae are reported, based on queries from a World Rubiaceae Checklist database. Rubiaceae are widespread and occur in all major regions of the world except the Antarctic Continent, but are predominantly a group in the tropics with greatest diversity in low- to mid-altitude humid forests. A count of Rubiaceae species and genera is given (13,143 spp./611 genera), which confirms that this is the fourth largest angiosperm family. Psychotria L. is the largest genus in the Rubiaceae (1834 spp.) and the third largest angiosperm genus. Most genera (72%) have fewer than 10 species and 211 are monotypic. Calculation of relative species diversity and percentage endemism enables areas of high diversity and endemism to be enumerated, and identifies areas where further field collecting and taxonomic research are required. Endemism is generally high in Rubiaceae, which supports data from recent studies showing that many species have restricted distributions. Given the assumed ecologic sensitivity of Rubiaceae, in combination with a range of other factors including restricted distribution, we suggest that species in this family are particularly vulnerable to extinction. The rate at which new species are being described is inadequate; more resources are required before the diversity of Rubiaceae is satisfactorily enumerated.
Journal Article
Harmonizing taxon names in biodiversity data: A review of tools, databases and best practices
by Berti, Emilio; Carvajal‐Quintero, Juan; Sagouis, Alban
in Applications programs, Best practice, Biodiversity
2023
The process of standardizing taxon names, taxonomic name harmonization, is necessary to properly merge data indexed by taxon names. The large variety of taxonomic databases and related tools are often not well described. It is often unclear which databases are actively maintained or what is the original source of taxonomic information. In addition, software to access these databases is developed following non‐compatible standards, which creates additional challenges for users. As a result, taxonomic harmonization has become a major obstacle in ecological studies that seek to combine multiple datasets. Here, we review and categorize a set of major taxonomic databases publicly available as well as a large collection of R packages to access them and to harmonize lists of taxon names. We categorized available taxonomic databases according to their taxonomic breadth (e.g. taxon specific vs. multi‐taxa) and spatial scope (e.g. regional vs. global), highlighting strengths and caveats of each type of database. We divided R packages according to their function, (e.g. syntax standardization tools, access to online databases, etc.) and highlighted overlaps among them. We present our findings (e.g. network of linkages, data and tool characteristics) in a ready‐to‐use Shiny web application (available at: https://mgrenie.shinyapps.io/taxharmonizexplorer/). We also provide general guidelines and best practice principles for taxonomic name harmonization. As an illustrative example, we harmonized taxon names of one of the largest databases of community time series currently available. We showed how different workflows can be used for different goals, highlighting their strengths and weaknesses and providing practical solutions to avoid common pitfalls. To our knowledge, our opinionated review represents the most exhaustive evaluation of links among and of taxonomic databases and related R tools. 
Finally, based on our new insights in the field, we make recommendations for users, database managers and package developers alike.
Journal Article
WorldFlora: An R package for exact and fuzzy matching of plant names against the World Flora Online taxonomic backbone data
2020
Premise: The standardization of plant names is a critical step in various fields of biology, including biodiversity, biogeography, and vegetation research. The WorldFlora package is introduced here to help achieve this goal by matching lists of plant names with a static copy from World Flora Online (WFO), an ongoing global effort to complete an online flora of all known vascular plants and bryophytes by 2020. Methods and Results: Based on direct and fuzzy matching, WorldFlora inserts matching cases from the WFO to a submitted data set containing taxonomic names. The results and success rates for selecting the expected best single matches are presented for four data sets, including two data sets used in recent comparisons of software tools for correcting taxon names. Conclusions: WorldFlora offers a straightforward pipeline for semi‐automatic plant name checking. For the four data sets, the success rate of credible matches ranged from 94.7% to 99.9%.
Journal Article
Recommendations for the Standardisation of Open Taxonomic Nomenclature for Image-Based Identifications
by Vandepitte, Leen; Bett, Brian J.; Gates, Andrew R.
in BBNJ Agreement, Biodiversity, biodiversity informatics
2021
This paper recommends best practice for the use of open nomenclature (ON) signs applicable to image-based faunal analyses. It is one of numerous initiatives to improve biodiversity data input to improve the reliability of biological datasets and their utility in informing policy and management. Image-based faunal analyses are increasingly common but have limitations in the level of taxonomic precision that can be achieved, which varies among groups and imaging methods. This is particularly critical for deep-sea studies owing to the difficulties in reaching confident species-level identifications of unknown taxa. ON signs indicate a standard level of identification and improve clarity, precision and comparability of biodiversity data. Here we provide examples of recommended usage of these terms for input to online databases and preparation of morphospecies catalogues. Because the processes of identification differ when working with physical specimens and with images of the taxa, we build upon previously provided recommendations for specific use with image-based identifications.
Journal Article
Treemendous: an R package for integrating taxonomic information across backbones
by Paz, Andrea; Maynard, Daniel S.; Specker, Felix
in Authorship, Biodiversity, Biodiversity research
2024
Standardizing and translating species names from different databases is key to the successful integration of data sources in biodiversity research. There are numerous taxonomic name-resolution applications that implement increasingly powerful name-cleaning and matching approaches, allowing the user to resolve species relative to multiple backbones simultaneously. Yet there remains no principled approach for combining information across these underlying taxonomic backbones, complicating efforts to combine and merge species lists with inconsistent and conflicting taxonomic information. Here, we present Treemendous, an open-source software package for the R programming environment that integrates taxonomic relationships across four publicly available backbones to improve the name resolution of tree species. By mapping relationships across the backbones, this package can be used to resolve datasets with conflicting and inconsistent taxonomic origins, while ensuring the resulting species are accepted and consistent with a single reference backbone. The user can chain together different functionalities ranging from simple matching to a single backbone, to graph-based iterative matching using synonym-accepted relations across all backbones in the database. In addition, the package allows users to ‘translate’ one tree species list into another, streamlining the assimilation of new data into preexisting datasets or models. The package provides a flexible workflow depending on the use case, and can either be used as a stand-alone name-resolution package or in conjunction with existing packages as a final step in the name-resolution pipeline. The Treemendous package is fast and easy to use, allowing users to quickly merge different data sources by standardizing their species names according to the regularly updated database. 
By combining taxonomic information across multiple backbones, the package increases matching rates and minimizes data loss, allowing for more efficient translation of tree species datasets to aid research into forest biodiversity and tree ecology.
Journal Article
KSGP 3.1: improved taxonomic annotation of Archaea communities using LotuS2, the genome taxonomy database and RNAseq data
by Aleidan, Abdullah; Grant, Alastair; Fritscher, Joachim
in Annotations, Archaea, Biological Systematics
2025
Taxonomic annotation is a substantial challenge for Archaea metabarcoding. A limited number of reference sequences are available; a substantial fraction of phylogenetic diversity is not fully characterized; widely used databases do not reflect current archaeal taxonomy and contain mislabelled sequences. We address these gaps with a systematic and tractable approach based around the Genome Taxonomy Database (GTDB) combined with the eukaryote PR2 and MIDORI mitochondrial databases. After removing incongruent, chimeric and duplicate SSU sequences, this combination (GTDB+) provides a small improvement in annotation of a set of estuarine Archaea Operational Taxonomic Units (OTUs) compared to SILVA. We add to this a collection of near full length rRNA sequences and the prokaryote SSU sequences in SILVA, creating a new reference database, KSGP (Karst, Silva, GTDB, and PR2). The additional sequences are (re-)annotated using three different approaches. The most conservative, using lowest common ancestor, gives a further small improvement. Annotation using SINTAX increases Class and Order assignments by 2.7 and 4.2 times over SILVA, although this may include some “lumping” of un-named and named clades. Still further improvement can be made using similarity based clustering to group database sequences into putative taxa at all taxonomic levels, assigning 60% and 41% of Archaea OTUs to putative family and genus level taxa respectively. GTDB without cleaning and GreenGenes2 both perform poorly and cannot be recommended for use with Archaea. We make the GTDB+ and KSGP databases available at ksgp.earlham.ac.uk; integrate them into a metabarcoding pipeline, LotuS2 and outline their use to annotate Archaea OTUs and metatranscriptomic data.
Journal Article
Ten years and a million links: building a global taxonomic library connecting persistent identifiers for names, publications and people
2023
A major gap in the biodiversity knowledge graph is a connection between taxonomic names and the taxonomic literature. While both names and publications often have persistent identifiers (PIDs), such as Life Science Identifiers (LSIDs) or Digital Object Identifiers (DOIs), LSIDs for names are rarely linked to DOIs for publications. This article describes efforts to make those connections across three large taxonomic databases: Index Fungorum, International Plant Names Index (IPNI) and the Index of Organism Names (ION). Over a million names have been matched to DOIs or other persistent identifiers for taxonomic publications. This represents approximately 36% of names for which publication data are available. The mappings between LSIDs and publication PIDs are made available through ChecklistBank. Applications of this mapping are discussed, including a web app to locate the citation of a taxonomic name and a knowledge graph that uses data on researcher ORCID ids to connect taxonomic names and publications to authors of those names.
Journal Article
A Global Assessment of Distribution, Diversity, Endemism, and Taxonomic Effort in the Rubiaceae
2009
Analyses of distribution, diversity, endemism, and taxonomic effort for Rubiaceae are reported, based on queries from a World Rubiaceae Checklist database. Rubiaceae are widespread and occur in all major regions of the world except the Antarctic Continent, but are predominantly a group in the tropics with greatest diversity in low- to mid-altitude humid forests. A count of Rubiaceae species and genera is given (13,143 spp./611 genera), which confirms that this is the fourth largest angiosperm family. Psychotria L. is the largest genus in the Rubiaceae (1834 spp.) and the third largest angiosperm genus. Most genera (72%) have fewer than 10 species and 211 are monotypic. Calculation of relative species diversity and percentage endemism enables areas of high diversity and endemism to be enumerated, and identifies areas where further field collecting and taxonomic research are required. Endemism is generally high in Rubiaceae, which supports data from recent studies showing that many species have restricted distributions. Given the assumed ecologic sensitivity of Rubiaceae, in combination with a range of other factors including restricted distribution, we suggest that species in this family are particularly vulnerable to extinction. The rate at which new species are being described is inadequate; more resources are required before the diversity of Rubiaceae is satisfactorily enumerated.
Journal Article
Adapting mark-recapture methods to estimating accepted species-level diversity: a case study with terrestrial Gastropoda
2022
We introduce a new method of estimating accepted species diversity by adapting mark-recapture methods to comparisons of taxonomic databases. A taxonomic database should become more complete over time, so the error bar on an estimate of its completeness and the known diversity of the taxon it treats will decrease. Independent databases can be correlated, so we use the time course of estimates comparing them to understand the effect of correlation. If a later estimate is significantly larger than an earlier one, the databases are positively correlated, if it is significantly smaller, they are negatively correlated, and if the estimate remains roughly constant, then the correlations have averaged out. We tested this method by estimating how complete MolluscaBase is for accepted names of terrestrial gastropods. Using random samples of names from an independent database, we determined whether each name led to a name accepted in MolluscaBase. A sample tested in August 2020 found that 16.7% of tested names were missing; one in July 2021 found 5.3% missing. MolluscaBase grew by almost 3,000 accepted species during this period, reaching 27,050 species. The estimates ranged from 28,409 ± 365 in 2021 to 29,063 ± 771 in 2020. All estimates had overlapping 95% confidence intervals, indicating that correlations between the databases did not cause significant problems. Uncertainty beyond sampling error added 475 ± 430 species, so our estimate for accepted terrestrial gastropods species at the end of 2021 is 28,895 ± 630 species. This estimate is more than 4,000 species higher than previous ones. The estimate does not account for ongoing flux of species into and out of synonymy, new discoveries, or changing taxonomic methods and concepts. 
The species naming curve for terrestrial gastropods is still far from reaching an asymptote, and combined with the additional uncertainties, this means that predicting how many more species might ultimately be recognized is presently not feasible. Our methods can be applied to estimate the total number of names of Recent mollusks (as opposed to names currently accepted), the known diversity of fossil mollusks, and known diversity in other phyla.
Journal Article
A worldwide geographical scheme for recording the distribution of marine biota: proposal and call for feedback
2025
This paper describes a project aimed at creating a worldwide set of polygons for recording marine distribution data, parallel to the current World Geographic Scheme for Recording Plant Distribution used on land. The countries’ Exclusive Economic Zones were either taken as recording units or subdivided according to Marine Ecosystems of the World or the IHO Limits of Oceans and Seas when appropriate; existing local schemes were adopted for Europe and Australia. A hierarchical set of five Level-1 units, 26 Level-2 units, 232 Level-3 units and 536 Level-4 units is presented for feedback and intended to be submitted as a standard to the Biodiversity Information Standards (TDWG). This project is expected to provide a means to instantly retrieve national checklists for any taxonomic group and also a valuable tool to handle imprecise country-level records from the old literature.
Journal Article