Search Results
6 results for "Webbink, Kate"
Bridging Language Barriers: Lessons from the French Translation of Latimer Core
Internationalization of standards documentation is essential for pursuing global interoperability through the adoption of data standards that can be understood and competently applied throughout the world and across sociocultural contexts. Ratified by Biodiversity Information Standards (TDWG) in 2024, the Latimer Core (LtC) data standard focuses on the representation and discovery of natural science collections (Woodburn et al. 2022). The first complete translation of the LtC standard documentation, into French, was published in June 2025, facilitating access to the standard for francophone communities. Translating biodiversity data standards such as Latimer Core into French presents a series of intertwined linguistic and technical challenges. The rigor of the translation effort depends on consistent terminology within a given standard, achieved through careful reuse of formulas, such as 'recommended best practice', and the support of translation management tools such as Crowdin to ensure uniformity. In addition to intra-standard consistency, the reuse of Darwin Core terms, which were translated prior to the ratification of Latimer Core (see Saliba et al. (2025)), requires caution when retranslating definitions, to ensure homogeneity across standards. Aside from the linguistic elements discussed in part in Saliba et al. (2025), challenges such as documenting translation work remain. That documentation is largely informal, relying on collaborative platforms and personal notes, which underscores the need for more structured, reproducible workflows, especially when multiple translators work together on a given language, notably one with strong regional variants. Similarly, no universal threshold has been defined for the minimum content needed to achieve a “functional” translation. An incremental approach, beginning with labels and definitions and progressively expanding to webpage elements such as headers and footers, non-normative complementary information, and supplementary documentation, seems to be emerging as good practice. To address these issues and others, a recommendation document aimed at defining good practices and workflows for translating standards is being prepared. Finally, the Latimer Core maintenance group is experimenting with a point of contact for translation who acts as a bridge between translators, the standard maintenance group, and users. The point of contact can answer domain-specific questions, gather feedback from users, and report errors to the relevant translator. Ensuring that TDWG standards are available in French is a good way to broaden participation among underrepresented scientific communities across Africa, the Caribbean, the Pacific, and other francophone regions. Beyond opening doors for these audiences, the translation process itself offers a unique opportunity for contributors to deepen their understanding of a standard while making it, and subsequently connected standards, accessible to others. Far from being a mere technical task, translation is an intellectually rewarding and collaborative endeavor that amplifies the global relevance of TDWG’s work, ultimately enriching both the standards and the communities they serve.
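As a sketch of what a more structured consistency check could look like, the snippet below scans a hypothetical CSV export of translated terms and flags definitions where a recurring English formula is not rendered with an agreed French wording. The file name, column names, and the French rendering are illustrative assumptions, not part of the LtC tooling described above.

```python
import csv

# Agreed rendering for a recurring formula; one entry per boilerplate phrase.
# Both the English phrase and its French rendering here are assumptions.
FORMULAS = {"recommended best practice": "bonne pratique recommandée"}

def check_consistency(path):
    """Yield (term, formula) pairs where the French definition drops the agreed wording."""
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            for en, fr in FORMULAS.items():
                if en in row["definition_en"].lower() and fr not in row["definition_fr"].lower():
                    yield row["term"], en

if __name__ == "__main__":
    # Hypothetical export of translated LtC terms.
    for term, formula in check_consistency("ltc_terms_fr.csv"):
        print(f"{term}: '{formula}' not rendered with the agreed wording")
```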
No Pain No Gain: Standards mapping in Latimer Core development
Latimer Core (LtC) is a newly proposed Biodiversity Information Standards (TDWG) data standard that supports the representation and discovery of natural science collections by structuring data about the groups of objects that those collections and their subcomponents encompass (Woodburn et al. 2022). It is designed to be applicable to a range of use cases that include high-level collection registries, rich textual narratives and semantic networks of collections, as well as more granular, quantitative breakdowns of collections to aid collection discovery and digitisation planning. As a standard that is (in this first version) focused on natural science collections, LtC has significant intersections with existing data standards and models (Fig. 1) that represent individual natural science objects and occurrences and their associated data (e.g., Darwin Core (DwC), Access to Biological Collection Data (ABCD), the Conceptual Reference Model of the International Committee for Documentation (CIDOC-CRM)). LtC’s scope also overlaps with standards for more generic concepts like metadata, organisations, people and activities (e.g., Dublin Core, the World Wide Web Consortium (W3C) ORG Ontology and PROV Ontology, Schema.org). LtC represents just one element of this extended network of data standards for the natural sciences and related concepts. Mapping between LtC and intersecting standards is therefore crucial for avoiding duplication of effort in the standard development process, and for ensuring that data stored using the different standards are as interoperable as possible, in alignment with the FAIR (Findable, Accessible, Interoperable, Reusable) principles. In particular, it is vital to make robust associations between records representing groups of objects in LtC and records (where available) that represent the objects within those groups. During LtC development, efforts were made to identify and align with relevant standards and vocabularies, and to adopt existing terms from them where possible. During expert review, a more structured approach was proposed and implemented using the Simple Knowledge Organization System (SKOS) mappingRelation vocabulary. This exercise helped to better describe the nature of the mappings between new LtC terms and related terms in other standards, and to validate decisions around the borrowing of existing terms for LtC. A further exercise also used elements of the Simple Standard for Sharing Ontological Mappings (SSSOM) to start to develop a more comprehensive set of metadata around these mappings. At present, these mappings (Suppl. material 1 and Suppl. material 2) are provisional and not considered comprehensive, but they should be further refined and expanded over time. Even with the support provided by the SKOS and SSSOM standards, the LtC experience has shown the mapping process to be far from straightforward. Different standards vary in how they are structured: for example, DwC is a 'bag of terms' with informal classes and no structural constraints, while more structured standards and ontologies like ABCD and PROV employ different approaches to how structure is defined and documented. The various standards use different metadata schemas and serialisations (e.g., Resource Description Framework (RDF), XML) for their documentation, and different approaches to providing persistent, resolvable identifiers for their terms.
There are also many subtle nuances involved in assessing the alignment between the concepts that the source and target terms represent, particularly when assessing whether a match is exact enough to allow the existing term to be adopted. These factors make the mapping process quite manual and labour-intensive. Approaches and tools, such as developing decision trees (Fig. 2) to represent the logic involved and further exploring the SSSOM standard, could help to streamline this process. In this presentation, we will discuss the LtC experience of the standards mapping process, the challenges faced and methods used, and the potential to contribute this experience to a collaborative standards mapping effort within the anticipated TDWG Standards Mapping Interest Group.
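To illustrate the kind of term-level mapping record discussed above, the sketch below uses rdflib to assert SKOS mapping relations between LtC terms and terms in neighbouring standards. The LtC namespace URI and the specific match assertions are illustrative assumptions, not the provisional mappings from the supplementary material.

```python
# Illustrative only: these triples are not the ratified LtC mappings.
from rdflib import Graph, Namespace
from rdflib.namespace import SKOS

LTC = Namespace("http://rs.tdwg.org/ltc/terms/")    # assumed LtC term URI base
DWC = Namespace("http://rs.tdwg.org/dwc/terms/")    # Darwin Core terms
DCMITYPE = Namespace("http://purl.org/dc/dcmitype/")

g = Graph()
for prefix, ns in [("skos", SKOS), ("ltc", LTC), ("dwc", DWC), ("dcmitype", DCMITYPE)]:
    g.bind(prefix, ns)

# Each triple records how closely a new LtC term aligns with an existing
# term; a skos:exactMatch would flag a candidate for borrowing outright.
g.add((LTC.ObjectGroup, SKOS.closeMatch, DCMITYPE.Collection))
g.add((LTC.preservationMethod, SKOS.relatedMatch, DWC.preparations))

print(g.serialize(format="turtle"))
```

SSSOM would extend each of these assertions with mapping metadata (author, date, justification), which is where the more comprehensive exercise described above comes in.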
A Data Standard for Dynamic Collection Descriptions
The utopian vision is of a future where a digital representation of each object in our collections is accessible through the internet and sustainably linked to other digital resources. This is a long-term goal, however, and in the meantime there is an urgent need to share data about our collections at a higher level with a range of stakeholders (Woodburn et al. 2020). To achieve this sustainably, and to aggregate this information across all natural science collections, the data need to be standardised (Johnston and Robinson 2002). To this end, the Biodiversity Information Standards (TDWG) Collection Descriptions (CD) Interest Group has developed a data standard for describing collections, which is approaching formal review for ratification as a new TDWG standard. It proposes 20 classes (Suppl. material 1) and over 100 properties that can be used to describe, categorise, quantify, link and track digital representations of natural science collections, from high-level approximations to detailed breakdowns, depending on the purpose of a particular implementation. The wide range of use cases identified for representing collection description data means that a flexible approach to the standard and the underlying modelling concepts is essential. These are centred around the 'ObjectGroup' (Fig. 1), a class that may represent any group (of any size) of physical collection objects that have one or more common characteristics. This generic definition of the 'collection' in 'collection descriptions' is an important factor in making the standard flexible enough to support the breadth of use cases. For any use case or implementation, only a subset of classes and properties within the standard is likely to be relevant. In some cases, this subset may have little overlap with those selected for other use cases. This additional need for flexibility means that very few classes and properties, representing the core concepts, are proposed to be mandatory. Metrics, facts and narratives are represented in a normalised structure using an extended MeasurementOrFact class, so that these can be user-defined rather than constrained to a set identified by the standard. Finally, rather than including a rigid underlying data model as part of the normative standard, documentation will be developed to provide guidance on how the classes in the standard may be related and quantified according to relational, dimensional and graph-like models. In summary, the standard has, by design, been made flexible enough to be used in a number of different ways. The corresponding risk is that it could be used in ways that do not deliver what is needed in terms of outputs, manageability and interoperability with other resources of collection-level or object-level data. To mitigate this, it is key for any new implementer of the standard to establish how it should be used in that particular instance, and to define any necessary constraints within the wider scope of the standard and model. This is the concept of the 'collection description scheme', a profile that defines elements such as: which classes and properties should be included, which should be mandatory, and which should be repeatable; which controlled vocabularies and hierarchies should be used to make the data interoperable; how the collections should be broken down into individual ObjectGroups and interlinked; and how the various classes should be related to each other.
Various factors might influence these decisions, including the types of information that are relevant to the use case, whether quantitative metrics need to be captured and aggregated across collection descriptions, and how many resources can be dedicated to amassing and maintaining the data. This process has particular relevance to the Distributed System of Scientific Collections (DiSSCo) consortium, the design of which incorporates use cases for storing, interlinking and reporting on the collections of its member institutions. These include helping users of the European Loans and Visits System (ELViS) (Islam 2020) to discover specimens for physical and digital loans by providing descriptions and breakdowns of the collections of holding institutions, and monitoring digitisation progress across European collections through a dynamic Collections Digitisation Dashboard. In addition, DiSSCo will be part of a global collections data ecosystem requiring interoperation with other infrastructures such as the GBIF (Global Biodiversity Information Facility) Registry of Scientific Collections, the CETAF (Consortium of European Taxonomic Facilities) Registry of Collections and Index Herbariorum. In this presentation, we will introduce the draft standard, discuss the process of defining new collection description schemes using the standard and data model, and focus on DiSSCo requirements as examples of real-world collection description use cases.
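As a rough sketch of how a 'collection description scheme' might be made operational, the snippet below encodes a tiny profile (mandatory properties, repeatability, a controlled vocabulary) and validates one record against it. The profile structure and property names are hypothetical, chosen only to mirror the concepts described above rather than the draft standard's normative terms.

```python
# Hypothetical scheme: one implementation's constraints on the standard.
SCHEME = {
    "ObjectGroup": {
        "mandatory": ["name", "objectCount"],
        "repeatable": ["discipline"],
        "vocabularies": {"discipline": {"Botany", "Entomology", "Geology"}},
    },
}

def validate(record, cls, scheme=SCHEME):
    """Return a list of problems found in one record under the scheme."""
    rules, problems = scheme[cls], []
    for prop in rules["mandatory"]:
        if prop not in record:
            problems.append(f"missing mandatory property '{prop}'")
    for prop, value in record.items():
        if isinstance(value, list) and len(value) > 1 and prop not in rules["repeatable"]:
            problems.append(f"property '{prop}' is not repeatable")
    for prop, allowed in rules["vocabularies"].items():
        for value in record.get(prop, []):
            if value not in allowed:
                problems.append(f"'{value}' is not in the controlled vocabulary for '{prop}'")
    return problems

record = {"name": "Pinned insects, drawer 12", "discipline": ["Entomology"]}
print(validate(record, "ObjectGroup"))  # ["missing mandatory property 'objectCount'"]
```

A scheme of this kind is what lets two implementations share the same standard while enforcing very different constraints.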
Latimer Core: A new data standard for collection descriptions
The Latimer Core (LtC) schema, named after Marjorie Courtenay-Latimer, is a standard designed to support the representation and discovery of natural science collections by structuring data about the groups of objects that those collections and their subcomponents encompass. Individual items within those groups are represented through other current or emerging standards (e.g., Darwin Core, ABCD). The LtC classes and properties aim to represent information that describes these groupings in enough detail to inform deeper discovery of the resources contained within them. The standard has been developed under the Biodiversity Information Standards (TDWG) Collection Descriptions (CD) Interest Group, and evolved from the earlier work of the Natural Collection Descriptions (NCD) group. Version 1 of the standard includes 23 classes, each with two or more properties (Fig. 1 and Suppl. material 1). The central concept of the standard is the ObjectGroup class, which represents 'an intentionally grouped set of objects with one or more common characteristics'. Arranged around the ObjectGroup are a set of classes that are commonly used to describe and classify the objects within the ObjectGroup, classes covering aspects of the custodianship, management and tracking of the collections, a generic class (MeasurementOrFact) for storing qualitative or quantitative measures within the standard, and a set of classes that are used to describe the structure and description of the dataset. Latimer Core is intended to be sufficiently flexible and scalable to apply to a wide range of collection description use cases, from describing the overall collection holdings of an institution to the contents of a single drawer of material. Various approaches are used to support this flexibility, including the use of generic classes to represent organisations, people, roles and identifiers, and the enabling of flexible relationships for constructing data models that meet different use cases. The collection description scheme concept is introduced to enable adopters to specify rules for the use of LtC within each specific implementation, as demonstrated in Fig. 2. Guidance and reference examples for different modelling approaches to suit different use cases are provided in the LtC guidance documentation. The LtC standard has significant overlap with existing data standards (Suppl. material 2) that represent, for example, individual objects and occurrences, organisations, people and activities. Where possible, LtC has either borrowed terms directly from these standards or aligned with them less formally. Achieving a balance between offering a standard that is sufficiently comprehensive to stand alone and maintaining a low technical barrier to adoption, whilst minimising duplication of effort in the context of the wider standards landscape, is a notable challenge in the standard development process. The draft standard was submitted to the TDWG Executive in June 2022 to begin the process of formal review and ratification. The submission includes a list of standard terms and a GitHub wiki of guidance on the concepts behind, and use of, the standard.
In the meantime, the Task Group will continue working on reference examples and serialisations, and working with infrastructures such as the Distributed System of Scientific Collections (DiSSCo) consortium, the GBIF (Global Biodiversity Information Facility) Registry of Scientific Collections, the CETAF (Consortium of European Taxonomic Facilities) Registry of Collections and the Global Genome Biodiversity Network (GGBN) on potential roadmaps towards adoption. In this presentation, we will introduce the key Latimer Core deliverables, highlight some of the challenges faced in the development process, and discuss the potential for community adoption.
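For a concrete feel, a single ObjectGroup with user-defined metrics might be serialised along the lines of the sketch below. The JSON-style rendering and the property spellings follow the spirit of LtC v1 as described above; they are not quoted from the normative term list.

```python
# Illustrative ObjectGroup record; names and numbers are placeholders.
object_group = {
    "type": "ObjectGroup",
    "name": "Pinned Lepidoptera, cabinet 40",
    "discipline": ["Entomology"],
    "measurementOrFact": [
        # MeasurementOrFact keeps metrics user-defined rather than fixed by
        # the standard: each entry names its own measurement type.
        {"measurementType": "object count", "measurementValue": 5200},
        {"measurementType": "percent databased", "measurementValue": 40},
    ],
}
```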
Repatriation of Augmented Information to an Institutional Database
On the 9th of April 2010, the Field Museum received a momentous email from the ORNIS (ORnithology Network Information System) team informing them that they could now access the products of a nationwide georeferencing project; its bird collection could be, quite literally, put on the map. On the 7th of August 2017, those data (along with the sister datasets from FISHNet (FISH NETwork) and MaNIS (Mammal Network Information System)) finally made their way into the Museum’s collection management system. It's easy to get data out, so why is it so hard to get them back? To make it easier, what do we need to do in terms of coordination, staffing, and/or technological resources? How can tools like data quality flags better accommodate the needs of data providers as well as data users elsewhere along the collections data pipeline? We present a real-life case study of repatriating an enhanced dataset to its institution of origin, including details on timelines, estimates of effort, and lessons learned. The best-laid repatriation protocols might not prepare us for everything, but following them more closely might save us some sanity.
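In the spirit of the protocols mentioned above, a minimal repatriation step might look like the sketch below: georeferences from an aggregator export are folded back into local records keyed on catalogue number, and records with existing coordinates are flagged rather than silently overwritten. The matching key, conflict rule, and flag value are assumptions, not the Field Museum's actual workflow; the field names are Darwin Core terms.

```python
# A hedged sketch, not an actual institutional protocol. Rows are dicts such
# as those produced by csv.DictReader over local and aggregator exports.
def merge_georeferences(local_rows, enhanced_rows):
    enhanced = {r["catalogNumber"]: r for r in enhanced_rows}
    for row in local_rows:
        update = enhanced.get(row["catalogNumber"])
        if update is None:
            continue  # no georeference came back for this record
        if row.get("decimalLatitude"):
            # Existing coordinates conflict with the repatriated ones:
            # flag for human review instead of overwriting.
            row["georeferenceVerificationStatus"] = "requires verification"
        else:
            row["decimalLatitude"] = update["decimalLatitude"]
            row["decimalLongitude"] = update["decimalLongitude"]
            row["georeferenceSources"] = "ORNIS georeferencing project"
    return local_rows
```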
What’s Missing From All the Portals?
At the time of writing, there are over 784 million occurrence records in the Global Biodiversity Information Facility (GBIF) portal (gbif.org), 106 million on the iDigBio site (idigbio.org), 68 million in the Atlas of Living Australia (ala.org.au), and 20 million in VertNet (vertnet.org). The list of biodiversity aggregators and portals that boast occurrence counts in the millions continues to grow. Combined with sites that gather their data from outside of the GBIF domain, such as The Paleobiology Database, there is compelling evidence that global digitization is starting to illuminate the black hole of biodiversity data held in collections across the world. The visibility of, and demands on, our collective natural history heritage have never been as high, and they are increasingly in the spotlight with both internal and external audiences. Funding sources have moved away from massive "digitization for the sake of digitization" projects and demand much more focused proposals. To compete in this arena, collections staff and researchers must collaborate, mine collections for their strengths, and use those strengths to justify efforts. To do this, however, they must have access to information about the non-digitized occurrence-level records in the world’s holdings. We discuss the potential use of current TDWG standards to capture existing institutional data about undigitized collections, and also about collections whose records have been marked as environmentally, culturally, or politically sensitive and so must remain digitally dark, so that portals like GBIF can use them in a way comparable to existing occurrence records. Can Darwin Core (with its extensions), together with the Natural Collections Description (draft standard), be used to describe accessions, inventory-level information, and backlog estimates in an efficient and effective way, and provide even greater visibility of those undigitized occurrences? In addition, can these data also serve as a means to further refine existing digitized records?
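One speculative answer to the closing question is sketched below: an inventory-level record for an undigitized (or deliberately dark) holding that mixes Darwin Core terms with collection-description-style estimates. The combination of fields is an assumption about how such a record could look, not a published profile.

```python
# Speculative sketch: fields marked 'dwc:' are real Darwin Core terms; the
# count and status properties are hypothetical collection-level additions.
dark_holding = {
    "collectionCode": "Herpetology",           # dwc:collectionCode
    "preparations": "fluid-preserved",         # dwc:preparations
    "objectCount": {"value": 12000, "precision": "estimated"},
    "digitizationStatus": "not databased",
    "informationWithheld": "locality withheld: culturally sensitive site",  # dwc:informationWithheld
}
```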
At time of writing there are over 784 million occurrence records in the Global Biodiversity Information Facility (GBIF) portal (gbif.org), 106 million on the iDigBio site (idigbio.org); 68 million in the Atlas of Living Australia (ala.org.au) and 20 million in VertNet (vertnet.org). The list of biodiversity aggregators and portals that boast occurrence counts in the millions continues to increase. Combined with sites who gather data their data from outside of the GBIF domain such as ThePaleobiology Database, there is compelling evidence that global digitization is starting to illuminate the black hole of biodiversity data held in collections across the world. The visibility and demands on our collective natural history heritage have never been as high, and they are increasingly in the spotlight with both internal and external audiences. Funding sources have moved away from massive \"digitization for the sake of digitization\" projects and demand much more focused proposals. To compete in this arena, collections staff and researchers must collaborate and mine collections for their strengths and use those to justify efforts. To do this, however, they must have access to information about the non-digitized occurrence level records in the world’s holdings. We discuss the potential use of current TDWG standards to allow the capture of existing institutional data about undigitized collections and also those whose records have been marked as environmentally, culturally, or politically sensitive and so must remain digitally dark, so that portals like GBIF can use them in a comparable way as existing occurrence records. CanDarwin Core(with itsextensions) together with theNatural Collections Description(draft standard) be used to describe accessions, inventory-level information, and backlog estimates in an efficient and effective way and provide even greater visibility of those undigitized occurrences? In addition, can these data also serve as a means to further refine existing digitized records?