12 result(s) for "Gerbracht, Jeff"
Can Observation Skills of Citizen Scientists Be Estimated Using Species Accumulation Curves?
Volunteers are increasingly being recruited into citizen science projects to collect observations for scientific studies. An additional goal of these projects is to engage and educate these volunteers. Thus, there are few barriers to participation, resulting in volunteer observers with varying ability to complete the project's tasks. To improve the quality of a citizen science project's outcomes it would be useful to account for inter-observer variation, and to assess the rarely tested presumption that participating in a citizen science project results in volunteers becoming better observers. Here we present a method for indexing observer variability based on the data routinely submitted by observers participating in the citizen science project eBird, a broad-scale monitoring project in which observers collect and submit lists of the bird species observed while birding. Our method for indexing observer variability uses species accumulation curves, lines that describe how the total number of species reported increases with increasing time spent in collecting observations. We find that differences in species accumulation curves among observers equate to higher rates of species accumulation, particularly for harder-to-identify species, and reveal increased species accumulation rates with continued participation. We suggest that these properties of our analysis provide a measure of observer skill, and that the potential to derive post-hoc data-derived measurements of participant ability should be more widely explored by analysts of data from citizen science projects. We see the potential for inferential results from analyses of citizen science data to be improved by accounting for observer skill.
Birds of the World: A global reference for avian life histories and a case study of incompatible taxonomies
Life history accounts and taxonomic monographs are a series of publications covering a higher taxonomic group where each account is a compilation of existing knowledge detailing many aspects of a species' life history. These life history accounts are extensively used by researchers, ornithologists and conservationists as a main source for the current state of knowledge of a species. Birds, being one of the more easily seen and studied taxa, have a number of specialized life history accounts where data from a wide variety of disciplines are combined into a single easily accessible resource. The Cornell Lab of Ornithology (CLO) currently manages two of these series focused on different regions of the world, Birds of North America (BNA) and Neotropical Birds (NB). Lynx Edicions has published the Handbook of Birds of the World (HBW), an extensive set of avian monographs covering every species of bird in the world. A recently announced collaboration between CLO and Lynx Edicions provides us with the opportunity to bring together the extreme detail of the life history accounts from Birds of North America with the global coverage of HBW to produce a global, in-depth treatment of every species of bird in the world. The integration of life history information from these existing projects with different underlying taxonomies presents a variety of real-world examples of the challenges to be overcome to bring these life history accounts into alignment and provide the scientific and lay communities with taxonomically accurate and up-to-date information. The Handbook of Birds of the World currently follows the HBW and BirdLife Taxonomic Checklist v3 (with 11,126 species recognized) while Birds of North America and Neotropical Birds both follow the eBird/Clements checklist of birds of the world: v2018 (with 10,585 species recognized).
Of the roughly 11,000 species of birds, nearly 9,500 are direct matches between HBW/BirdLife and Clements at the species or species to subspecies levels. The remaining concept mismatches fall into several basic categories including lump and split differences as well as differences in which subspecies are included or excluded. In this talk we will discuss the challenges we have faced with managing and merging life history accounts where the underlying taxonomies are fundamentally different. With a requirement to ensure that life history accounts remain accurate when the underlying concepts of the original sources differ, we employ a variety of processes, some very labor intensive and some requiring in-depth taxonomic knowledge, to produce consolidated species accounts. Existing resources are integral to these types of integrations and in addition to the taxonomies themselves, cross-taxonomy mapping databases such as Avibase are key. Working through this process of consolidating life history accounts highlights the basic need for taxonomic management and publication toolsets built on underlying taxonomic and life history standards. Cross-institutional collaboration to produce these toolsets will be key to their development and successful adoption across the biodiversity and taxonomic communities. I will also discuss and propose a set of taxonomic management tools based on taxonomic concepts, some of which already exist and are used by bird taxonomists to annually update the Clements Checklist and some of which need to be implemented before we can accurately manage and consolidate biodiversity information and the evolving taxonomies on which those data are based.
COSA: Cloud Object Storage Archive for deep archival of digital data
The Cornell Lab of Ornithology gathers, utilizes and archives a wide variety of digital assets ranging from details of a bird observation to photos, video and sound recordings. Some of these datasets are fairly small, while others are hundreds of terabytes. In this presentation we will describe how the Lab archives these datasets to ensure the data are both loss-less and recoverable in the case of a widespread disaster, how the archival strategy has evolved over the years and explore in detail the current hybrid cloud storage management system. The Lab runs eBird and several other citizen science programs focused on birds where individuals from around the globe enter their sightings into a centralized database. The eBird project alone stores over 500,000,000 observations and the underlying database is over a terabyte in size. Birds of North America, Neotropical Birds and All About Birds are online species accounts comprising a wide range of authoritative life history articles maintained in a relatively small database. Macaulay Library is the world’s largest image, sound and video archive with over 6,000,000 cuts totaling nearly 100 TB of data. The Bioacoustics Research Program utilizes automated recording units (SWIFTs) in the forests of the US, jungles of Africa and in all seven oceans to record the environment. These units record 24 hours a day and gather a tremendous amount of raw data, over 200 TB to date with an expected rate of an additional 100 TB per year. Lastly, BirdCams run by the Lab add a steady stream of media detailing the reproductive cycles of a number of species. The Lab is committed to making these archives of the natural world available for research and conservation today. More importantly, ensuring these data exist and are accessible in 100 years is a critical component of the Lab data strategy.
The data management system for these digital assets has been completely overhauled to handle the rapidly increasing volume and to utilize on-premises systems and cloud services in a hybrid cloud storage system to ensure data are archived in a manner that is redundant, loss-less and insulated from disasters yet still accessible for research. With multimedia being the largest and most rapidly growing block of data, cost rapidly becomes a constraining factor of archiving these data in redundant, geographically isolated facilities. Datasets with a smaller footprint, such as eBird and the species accounts, allow for a wider variety of solutions as cost is less of a factor. Using different methods to take advantage of differing technologies and balancing cost vs recovery speed, the Lab has implemented several strategies based on data stability (eBird data are constantly changing), retrieval frequency required for research and overall size of the dataset. We utilize Amazon S3 and Glacier as our media archive; we tag each media file in Glacier with a set of basic DarwinCore metadata fields that key back to a master metadata database and numerous project-specific databases. Because these metadata databases are much smaller in size, yet critical in searching and retrieval of a required media file, they are archived differently with up-to-the-minute replication to prevent any data loss due to an unexpected disaster. The media files are tagged with a standard set of basic metadata and in the case where the metadata databases were unavailable, retrieval of specific media and basic metadata can still occur. This system has allowed the Lab to place into long term archive hundreds of terabytes of data, store them in redundant, geographically isolated locations and provide for complete disaster recovery of the data and metadata.
A Content Management System and underlying models for avian taxonomic monographs
Taxonomic monographs are a series of publications covering a higher taxonomic group with each monograph focusing on an individual species. They are a compendium of the current state of research and knowledge detailing many aspects of the species and are extensively used by researchers, ornithologists and conservationists to learn what is ‘currently’ known about a species. Birds, being one of the more easily seen and studied taxa, have a number of specialized taxonomic monographs where data from a wide variety of disciplines are combined into a single place and utilized for research and conservation management. Many of the existing avian monographs have regional or subdomain focus such as “Birds of the Western Palearctic” or “Catalan Breeding Bird Atlas 1999-2002” and monographs are sometimes focused on different user communities, ranging from those with casual interest to professional ornithologists and researchers. The Lab of Ornithology maintains several monograph series. Merlin and All About Birds include simplified information that is of interest to the casual observer and Birds of North America and Neotropical Birds Online are monographs with complete, detailed life histories, prepared for ornithologists and active researchers. These monograph projects were originally supported using different Content Management Systems which became very difficult to maintain, made it difficult to keep content current and provided no capacity for organizing and sharing of content across monograph projects. Bird taxonomies change annually and the previous systems had no capacity to intelligently manage taxonomic changes. To solve these issues, we created a new Content Management System with Taxonomic Concepts at its core. Reviewing a number of existing monograph projects led us to create an underlying content structure that is very analogous to Plinian Core.
The initial requirement to support multiple monograph series, some focused on the professional community and others focused on budding amateurs, presented challenges to creating a ‘one size fits all’ model for structuring content that includes authoritative articles covering most aspects of a species' life history, traditional range maps, dynamic observation maps, relative abundance models, photos, images, video and a bibliography. In this talk I’ll present in detail the Content Management System and the underlying models we have developed. Four of these five models are tied to the underlying taxonomic concept while the fifth is tied to the taxonomic names. Articles, multimedia (including traditional range maps), taxonomic description and bibliography have long existed in print monographs and having these authored and displayed via the web makes it much simpler to incorporate new information, keep the information current and publish the information to an existing standard. The incorporation of dynamic content has only been possible with the advent of the web and standards for the underlying Taxonomic Concepts. With four monographs currently in production and several more in development, we’ve encountered both advantages and disadvantages in using these models for managing and serving monograph series. I will discuss these in detail and compare the models with Plinian Core to highlight both fundamental differences as well as common ground.
Species Information pages, how are the data discovered, consolidated and presented
A number of different projects consolidate species information from widely disparate datasets and compile them into a single resource. These projects vary in several dimensions, including taxonomic coverage, depth of information and audience, such as humans or machines. Some focus on Life History information, others focus on observations and specimens or taxonomies and phylogenies. Encyclopedia of Life (eol.org) was one of the early projects and in 2007 took on the challenge of creating a web page for every species in the world, from bacteria to birds. Other projects focused on specific taxonomic groups or regions such as FishBase (fishbase.org) and Atlas of Living Australia (ALA). Efforts such as the Global Biodiversity Information Facility (GBIF) consolidate observational data globally. At least 5 projects focus solely on the life histories of birds including Birds of North America, Neotropical Birds, Handbook of Birds of the World Alive (HBW) and others. The species data included can range from genomic sequences to studies on demography and behavior, from photos and sound recordings to museum specimens. All these various resources are scattered around the globe and discovering the data of interest and accurately resolving the data to the correct ‘species’ is an ongoing and significant challenge. Publishing taxonomic concepts is still in its infancy, yet is key to discovering and resolving these types of data. Additionally, biological and environmental trait data are often consolidated within a species account, yet the discovery of these data is frequently a difficult and labor-intensive process. In this talk, we will review Jaguar, a content management system (CMS) being used by the Cornell Lab of Ornithology to manage species account projects focused on birds, which currently include Birds of North America, Neotropical Birds, Merlin and All About Birds.
This custom CMS was designed with taxonomic concepts at the foundation and utilizing these taxonomic concepts, species accounts are automatically extended with observation maps, multimedia and results from various big data analysis projects. A set of common trait data associated with species is managed using controlled vocabularies and displayed within these species accounts. We have defined a set of traits, focused on birds, that are generally known and which are most useful to a broad ornithological audience. We will discuss challenges we have faced in managing these species accounts and future opportunities to extend and enhance these accounts, especially as taxonomic concepts are published and adopted and trait ontologies are defined and, most importantly, applied.
Best Practices for using Cloud Services for Digital Data Archive and Disaster Recovery
Managing digital data for long-term archival and disaster recovery is a key component of our collective responsibility in managing digital data and metadata. As more and more data are collected digitally and as the metadata for traditional museum collections becomes both digitized and more comprehensive, the need to ensure that these data are safe and accessible in the long term becomes essential. Unfortunately, disasters do occur and many irreplaceable datasets on biodiversity have been permanently lost. Maintaining a long-term archive and putting in place reliable disaster recovery processes can be prohibitively expensive, both in the cost of hardware and software as well as the costs of personnel to manage and maintain an archival system. Traditionally, storing digital data for the long term and ensuring the data are loss-less, safe and completely recoverable when a disaster occurs has been managed on-premises with a combination of on-site and off-site storage. This requires complex data workflows to ensure that all data are securely and redundantly stored in multiple highly dispersed locations to minimize the threat of data loss due to local or regional disasters. Files are often moved multiple times across operating systems and media types on their way to and from a deep archive, increasing the risk of file integrity issues. With the recent advent of an array of Cloud Services from organizations such as Amazon, Microsoft and Google to more focused offerings from Iron Mountain, Atempo and others, we have a number of options for long term archival of digital data. Deep archive solutions, storage where retrieval is expected only in the case of a disaster, are offered by many of these organizations at a rate substantially less than their normal data storage fees.
The most basic requirement for an archival system is storing multiple replicates of the data in geographically isolated locations with a mechanism for guaranteeing file integrity, usually using a checksum algorithm. Additional components that are integral to a robust archive include a simple metadata search and reliable retrieval. In this presentation, we’ll discuss the need for long term archive and disaster recovery capabilities, detail the current best practices of data archival systems and review a variety of archival options that have become available with Cloud Services.
Can Biodiversity Data Scientists Document Volunteer and Professional Collaborations and Contributions in the Biodiversity Data Enterprise?
The collection, archiving and use of biodiversity data depend on a network of pipelines herein called the Biodiversity Data Enterprise (BDE) and best understood globally through the work of the Global Biodiversity Information Facility (GBIF). Efforts to sustain and grow the BDE require information about the data pipeline and the infrastructure that supports it. A host of metrics from GBIF, including institutional participation (member countries, institutional contributors, data publishers), biodiversity coverage (occurrence records, species, geographic extent, data sets) and data usage (records downloaded, published papers using the data) (Miller 2021), document the rapid growth and successes of the BDE (GBIF Secretariat 2022). Heberling et al. (2021) make a convincing case that the data integration process is working. The Biodiversity Information Standards' (TDWG) Basis of Record term provides information about the underlying infrastructure. It categorizes the kinds of processes that teams undertake to capture biodiversity information and GBIF quantifies their contributions (Table 1). Currently 83.4% of observations come from human observations, of which 63% are of birds. Museum preserved specimens account for 9.5% of records. In both cases, a combination of volunteers (who make observations, collect specimens, digitize specimens, transcribe specimen labels) and professionals work together to make records available. To better understand how the BDE is working, we suggest that it would be of value to know the number of contributions and contributors and their hours of engagement for each data set.
This can help the community address questions such as, "How many volunteers do we need to document birds in a given area?" or "How much professional support is required to run a camera trap network?" For example, millions of observations were made by tens of thousands of observers in two recent BioBlitz events, one called Big Day, focusing on birds, sponsored by the Cornell Laboratory of Ornithology and the other called the City Nature Challenge, addressing all taxa, sponsored jointly by the California Academy of Sciences and the Natural History Museums of Los Angeles County (Table 2). In our presentation we will suggest approaches to deriving metrics that could be used to document the collaborations and contribution of volunteers and staff using examples from both Human Observation (eBird, iNaturalist) and Preserved Specimen (DigiVol, Notes from Nature) record types. The goal of the exercise is to start a conversation about how such metrics can further the development of the BDE.
eBird: A Human / Computer Learning Network to Improve Biodiversity Conservation and Research
eBird is a citizen-science project that takes advantage of the human observational capacity to identify birds to species, and uses these observations to accurately represent patterns of bird occurrences across broad spatial and temporal extents. eBird employs artificial intelligence techniques such as machine learning to improve data quality by taking advantage of the synergies between human computation and mechanical computation. We call this a human/computer learning network, whose core is an active learning feedback loop between humans and machines that dramatically improves the quality of both and thereby continually improves the effectiveness of the network as a whole. In this article we explore how human/computer learning networks can leverage the contributions of human observers and process their contributed data with artificial intelligence algorithms leading to a computational power that far exceeds the sum of the individual parts.