15 result(s) for "Chaumeil, Pierre-Alain"
A complete domain-to-species taxonomy for Bacteria and Archaea
The Genome Taxonomy Database is a phylogenetically consistent, genome-based taxonomy that provides rank-normalized classifications for ~150,000 bacterial and archaeal genomes from domain to genus. However, almost 40% of the genomes in the Genome Taxonomy Database lack a species name. We address this limitation by using commonly accepted average nucleotide identity criteria to set bounds on species and propose species clusters that encompass all publicly available bacterial and archaeal genomes. Unlike previous average nucleotide identity studies, we chose a single representative genome to serve as the effective nomenclatural ‘type’ defining each species. Of the 24,706 proposed species clusters, 8,792 are based on published names. We assigned placeholder names to the remaining 15,914 species clusters to provide names to the growing number of genomes from uncultivated species. This resource provides a complete domain-to-species taxonomic framework for bacterial and archaeal genomes, which will facilitate research on uncultivated species and improve communication of scientific results. A full species classification is built for all publicly available bacterial and archaeal genomes.
A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life
Interpretation of microbial genome data will be improved by a fully revised bacterial taxonomy. Taxonomy is an organizing principle of biology and is ideally based on evolutionary relationships among organisms. Development of a robust bacterial taxonomy has been hindered by an inability to obtain most bacteria in pure culture and, to a lesser extent, by the historical use of phenotypes to guide classification. Culture-independent sequencing technologies have matured sufficiently that a comprehensive genome-based taxonomy is now possible. We used a concatenated protein phylogeny as the basis for a bacterial taxonomy that conservatively removes polyphyletic groups and normalizes taxonomic ranks on the basis of relative evolutionary divergence. Under this approach, 58% of the 94,759 genomes comprising the Genome Taxonomy Database had changes to their existing taxonomy. This result includes the description of 99 phyla, including six major monophyletic units from the subdivision of the Proteobacteria, and amalgamation of the Candidate Phyla Radiation into a single phylum. Our taxonomy should enable improved classification of uncultured bacteria and provide a sound basis for ecological and evolutionary studies.
A standardized archaeal taxonomy for the Genome Taxonomy Database
The accrual of genomic data from both cultured and uncultured microorganisms provides new opportunities to develop systematic taxonomies based on evolutionary relationships. Previously, we established a bacterial taxonomy through the Genome Taxonomy Database. Here, we propose a standardized archaeal taxonomy that is derived from a 122-concatenated-protein phylogeny that resolves polyphyletic groups and normalizes ranks based on relative evolutionary divergence. The resulting archaeal taxonomy, which forms part of the Genome Taxonomy Database, is stable for a range of phylogenetic variables including marker gene selection, inference methods, corrections for rate heterogeneity and compositional bias, tree rooting scenarios and expansion of the genome database. Rank normalization is shown to robustly correct for substitution rates varying up to 30-fold using simulated datasets. Taxonomic curation follows the rules of the International Code of Nomenclature of Prokaryotes while taking into account proposals to formally recognize the rank of phylum and to use genome sequences as type material. This taxonomy is based on 2,392 archaeal genomes, 93.3% of which required one or more changes to their existing taxonomy, mainly owing to incomplete classification. We identify 16 archaeal phyla and reclassify 3 major monophyletic units from the former Euryarchaeota and one phylum that unites the Thaumarchaeota–Aigarchaeota–Crenarchaeota–Korarchaeota (TACK) superphylum into a single phylum. Resolving widespread incomplete and uneven archaeal classifications is achieved by a rank-normalized genome-based taxonomy.
Proposal of names for 329 higher rank taxa defined in the Genome Taxonomy Database under two prokaryotic codes
The Genome Taxonomy Database (GTDB) is a taxonomic framework that defines prokaryotic taxa as monophyletic groups in concatenated protein reference trees according to systematic criteria. This has resulted in a substantial number of changes to existing classifications (https://gtdb.ecogenomic.org). In the case of union of taxa, GTDB names were applied based on the priority of publication. The division of taxa or change in rank led to the formation of new Latin names above the rank of genus that were only made publicly available via the GTDB website without associated published taxonomic descriptions. This has sometimes led to confusion in the literature and databases. A number of the provisional GTDB names were later published in other studies, while many still lack authorships. To reduce further confusion, here we propose names and descriptions for 329 GTDB-defined prokaryotic taxa, 223 of which are suitable for validation under the International Code of Nomenclature of Prokaryotes (ICNP) and 49 under the Code of Nomenclature of Prokaryotes described from Sequence Data (SeqCode). For the latter, we designated 23 genomes as type material. An additional 57 taxa that do not currently satisfy the validation criteria of either code are proposed as Candidatus. Provisional names earlier given to 329 GTDB-defined prokaryotic taxa are proposed according to the validation criteria of the International Code of Nomenclature of Prokaryotes and the SeqCode.
Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life
Challenges in cultivating microorganisms have limited the phylogenetic diversity of currently available microbial genomes. This is being addressed by advances in sequencing throughput and computational techniques that allow for the cultivation-independent recovery of genomes from metagenomes. Here, we report the reconstruction of 7,903 bacterial and archaeal genomes from >1,500 public metagenomes. All genomes are estimated to be ≥50% complete and nearly half are ≥90% complete with ≤5% contamination. These genomes increase the phylogenetic diversity of bacterial and archaeal genome trees by >30% and provide the first representatives of 17 bacterial and three archaeal candidate phyla. We also recovered 245 genomes from the Patescibacteria superphylum (also known as the Candidate Phyla Radiation) and find that the relative diversity of this group varies substantially with different protein marker sets. The scale and quality of this data set demonstrate that recovering genomes from metagenomes provides an expedient path forward to exploring microbial dark matter. The recovery of 7,903 bacterial and archaeal metagenome-assembled genomes increases the phylogenetic diversity represented by public genome repositories and provides the first representatives from 20 candidate phyla.
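The quality thresholds quoted in this abstract (at least 50% complete for every genome; at least 90% complete with at most 5% contamination for the higher-quality subset) can be illustrated with a minimal Python sketch. The class name, function name, and tier labels below are hypothetical and chosen only for illustration; the abstract does not state a contamination bound for the lower tier, so none is applied here, and this is not the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeQuality:
    """Estimated quality of one genome (hypothetical container)."""
    name: str
    completeness: float   # estimated completeness, percent (0-100)
    contamination: float  # estimated contamination, percent


def quality_tier(genome: GenomeQuality) -> str:
    """Assign an illustrative tier using only the thresholds in the abstract.

    Tier labels are assumptions made for this sketch:
      - "near-complete": >= 90% complete and <= 5% contamination
      - "included":      >= 50% complete (no contamination bound is stated)
      - "excluded":      below 50% complete
    """
    if genome.completeness >= 90.0 and genome.contamination <= 5.0:
        return "near-complete"
    if genome.completeness >= 50.0:
        return "included"
    return "excluded"


if __name__ == "__main__":
    # Made-up example genomes, not data from the paper.
    examples = [
        GenomeQuality("example_A", 95.2, 1.1),
        GenomeQuality("example_B", 72.4, 3.0),
        GenomeQuality("example_C", 41.0, 0.5),
    ]
    for g in examples:
        print(g.name, quality_tier(g))
```

A genome that is 95% complete but 8% contaminated falls into the "included" tier under this sketch, because it misses the contamination bound of the higher tier while still meeting the 50% completeness floor.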
Evaluation of a concatenated protein phylogeny for classification of tailed double-stranded DNA viruses belonging to the order Caudovirales
Viruses of bacteria and archaea are important players in global carbon cycling as well as drivers of host evolution, yet the taxonomic classification of viruses remains a challenge due to their genetic diversity and absence of universally conserved genes. Traditional classification approaches employ a combination of phenotypic and genetic information which is no longer scalable in the era of bulk viral genome recovery through metagenomics. Here, we evaluate a phylogenetic approach for the classification of tailed double-stranded DNA viruses from the order Caudovirales by inferring a phylogeny from the concatenation of 77 single-copy protein markers using a maximum-likelihood method. Our approach is largely consistent with the International Committee on Taxonomy of Viruses, with 72 and 89% congruence at the subfamily and genus levels, respectively. Discrepancies could be attributed to misclassifications and a small number of highly mosaic genera confounding the phylogenetic signal. We also show that confidently resolved nodes in the concatenated protein tree are highly reproducible across different software and models, and conclude that the approach can serve as a framework for a rank-normalized taxonomy of most tailed double-stranded DNA viruses. Phylogenetic analysis based on a concatenated set of 77 single-copy marker genes enabled the classification of tailed double-stranded DNA viruses from large datasets, and was reproducible across software and models, providing a framework that could be applied to other viruses.
Author Correction: A complete domain-to-species taxonomy for Bacteria and Archaea
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Genome Taxonomy Database and SeqCode: Microbial Taxonomy and Nomenclature in the Age of Big Sequence Data
Microbial taxonomy and nomenclature have been challenged by methodological advances in high-throughput sequencing and high-performance computing. While taxonomy appears to adapt rapidly and has benefited enormously from the availability of whole-genome sequences, nomenclature still struggles to embrace these changes. Here, we present two independent initiatives that have resulted from the transitions of taxonomic practices in microbiology from a phenotypic and single gene-driven framework to a genome-based driven framework. The first initiative, the Genome Taxonomy Database (GTDB), was developed to address the needs of microbial taxonomists to classify rapidly accumulating genome sequences from both cultured and uncultured microorganisms. Availability of growing numbers of metagenome-assembled genomes (MAGs) and single amplified genomes (SAGs), combined with the genomes from cultured species, created a perfect opportunity for building a consensus classification based on an evolutionary framework. This has been realised in the GTDB, a knowledgebase that provides phylogenetically consistent and rank-normalised taxonomies for bacterial and archaeal genomes. A distinctive feature of GTDB is a complete classification of genomes from species to domain using an automated approach combining average nucleotide identity (ANI) and relative evolutionary divergence (RED), followed by manual curation. GTDB has become an essential taxonomic resource for microbiologists worldwide, attracting ~3,500 users per month. GTDB mainly relies on two public databases, the National Center for Biotechnology Information (NCBI) Assembly database to which GTDB releases are indexed and the List of Prokaryotic names with Standing in Nomenclature (LPSN), as the primary nomenclatural reference. 
The database operates according to the FAIR (Findable, Accessible, Interoperable, Reusable) data principles and incorporates its own internal (e.g., standards for delineating taxa) as well as external standards. The latter are often directly adopted from the NCBI since it is used as a primary source of genomes as well as metadata. Examples of such standards include Darwin Core data standards from Biodiversity Information Standards (TDWG), Minimum Information (MI) about any (x) Sequence (MIxS) and MISAG and MIMAG standards (Bowers et al. 2017) from the Genomic Standards Consortium. GTDB is used by many third-party resources and provides direct links to external public resources used for curation and validation of taxonomies. Importantly, GTDB contributes to the further generation of knowledge by enabling users to classify their own genomes within the GTDB taxonomic framework using our open-source GTDB-Tk tool. To our knowledge, GTDB is the only database that provides a comprehensive systematic de novo taxonomy for prokaryotes, which serves a multitude of purposes to its global users. The second initiative, the Code of Nomenclature of Prokaryotes Described from Sequence Data or SeqCode, was developed in response to the need for formal naming of uncultured microbial diversity. This need has become even more evident with the establishment of the GTDB taxonomy, which highlighted many issues with nomenclature of uncultured taxa at scale. These include the absence of nomenclatural types, proposed higher taxon names without named children, and the lack of priority for Candidatus names (a prefix indicating a provisional status for the names of organisms falling outside the existing Prokaryotic Code). All these issues arise from one core issue: the absence of regulations for naming uncultured taxa because the International Code of Nomenclature of Prokaryotes (ICNP; Oren et al. 2023) only applies to microorganisms able to be obtained in pure culture. 
To solve this problem and ultimately to be able to express taxonomic affiliations of uncultured taxa in a regulated manner, genome sequences are proposed to serve as nomenclatural types under the SeqCode. This new code has many common aspects with the ICNP and recognises names that are validly published under the ICNP. It operates via an online Registry that allows registration and validation of names following one of two paths: new names are registered and reviewed prior to publication and validated upon the notification about effective publication, or existing names such as names of Candidatus taxa are registered and reviewed with a validation certificate granted upon the satisfaction of all checks. To avoid naming ambiguity and ensure accurate species descriptions, SeqCode requires that genome sequences designated as types satisfy recommendations on minimal standards for DNA sequences, which are largely adopted from the MISAG and MIMAG standards. The SeqCode Registry also embraces FAIR principles, and was developed with interoperable data structures to facilitate the sharing of its names across global biodiversity resources including GTDB. Recently, we illustrated how SeqCode can be applied, along with the ICNP, by proposing new names for GTDB-defined higher taxonomic names under the two codes (Chuvochina et al. 2023). While it is not ideal to operate under two Prokaryotic codes, we believe that this development is a necessary step towards a unified nomenclatural system.
Author Correction: Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life
In the original version of this Article, the authors stated that the archaeal phylum Parvarchaeota was previously represented by only two single-cell genomes (ARMAN-4_'5-way FS' and ARMAN-5_'5-way FS'). However, these are in fact unpublished, low-quality metagenome-assembled genomes (MAGs) obtained from Richmond Mine, California. In addition, the authors overlooked two higher-quality published Parvarchaeota MAGs from the same habitat, ARMAN-4 (ADCE00000000) and ARMAN-5 (ADHF00000000) (B. J. Baker et al., Proc. Natl Acad. Sci. USA 107, 8806–8811; 2010). The ARMAN-4 and ARMAN-5 MAGs are estimated to be 68.0% and 76.7% complete with 3.3% and 5.6% contamination, respectively, based on the archaeal-specific marker sets of CheckM. The 11 Parvarchaeota genomes identified in our study were obtained from different Richmond Mine metagenomes, but are highly similar to the ARMAN-4 (ANI of ~99.7%) and ARMAN-5 (ANI of ~99.6%) MAGs. The highest-quality uncultivated bacteria and archaea (UBA) MAGs with similarity to ARMAN-4 and ARMAN-5 are 82.5% and 83.3% complete with 0.9% and 1.9% contamination, respectively. The Parvarchaeota represents only 0.23% of the archaeal genome tree and addition of the ARMAN-4 and ARMAN-5 MAGs does not change the conclusions of this Article, but does impact the phylogenetic gain for this phylum. This has now been corrected in all versions of the Article. An updated version of Fig. 5 has also been used to replace the previous version, with the row for Parvarchaeota removed, and Supplementary Table 15 and Supplementary Table 17 have both been replaced to reflect the availability of the two additional Parvarchaeota genomes. In addition, the Methods incorrectly stated that all metagenomes identified as being from studies where MAGs had previously been recovered were excluded from consideration.
Metagenomes from studies where MAGs had previously been recovered were retained if the UBA MAGs provided appreciable improvements in genome quality or phylogenetic diversity. All versions of the Article have been updated to indicate the retention of such metagenomes.
Selection of representative genomes for 24,706 bacterial and archaeal species clusters provide a complete genome-based taxonomy
We recently introduced the Genome Taxonomy Database (GTDB), a phylogenetically consistent, genome-based taxonomy providing rank normalized classifications for nearly 150,000 genomes from domain to genus. However, nearly 40% of the genomes used to infer the GTDB reference tree lack a species name, reflecting the large number of genomes in public repositories without complete taxonomic assignments. Here we address this limitation by proposing 24,706 species clusters which encompass all publicly available bacterial and archaeal genomes when using commonly accepted average nucleotide identity (ANI) criteria for circumscribing species. In contrast to previous ANI studies, we selected a single representative genome to serve as the nomenclatural type for circumscribing each species with type strains used where available. We complemented the 8,792 species clusters with validly or effectively published names with 15,914 de novo species clusters in order to assign placeholder names to the growing number of genomes from uncultivated species. This provides the first complete domain to species taxonomic framework which will improve communication of scientific results.