80 result(s) for "Horesh, Gal"
Producing polished prokaryotic pangenomes with the Panaroo pipeline
Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content resulting from horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here, we introduce Panaroo, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. Panaroo is available at https://github.com/gtonkinhill/panaroo.
Exploring bacterial diversity via a curated and searchable snapshot of archived DNA sequences
The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function and even anthropogenic activities such as the widespread use of antimicrobials. However, these data consist of genomes assembled with different tools and levels of quality checking, and of large volumes of completely unprocessed raw sequence data. In both cases, considerable computational effort is required before biological questions can be addressed. Here, we assembled and characterised 661,405 bacterial genomes retrieved from the European Nucleotide Archive (ENA) in November of 2018 using a uniform standardised approach. Of these, 311,006 did not previously have an assembly. We produced a searchable COmpact Bit-sliced Signature (COBS) index, facilitating the easy interrogation of the entire dataset for a specific sequence (e.g., gene, mutation, or plasmid). Additional MinHash and pp-sketch indices support genome-wide comparisons and estimations of genomic distance. Combined, this resource will allow data to be easily subset and searched, phylogenetic relationships between genomes to be quickly elucidated, and hypotheses rapidly generated and tested. We believe that this combination of uniform processing and variety of search/filter functionalities will make this a resource of very wide utility. In terms of diversity within the data, a breakdown of the 639,981 high-quality genomes emphasised the uneven species composition of the ENA/public databases, with just 20 of the total 2,336 species making up 90% of the genomes. The overrepresented species tend to be acute/common human pathogens, aligning with research priorities at different levels from individual interests to funding bodies and national and global public health agencies.
This study presents the first uniformly assembled, comprehensively described and searchable dataset of 661,405 bacterial genomes; this resource will empower more scientists to harness the multitude of data in public sequencing archives, but also reveals the biased composition of these archives, with 90% of the data originating from just 20 species.
Different evolutionary trends form the twilight zone of the bacterial pan-genome
The pan-genome is defined as the combined set of all genes in the gene pool of a species. Pan-genome analyses have been very useful in helping to understand different evolutionary dynamics of bacterial species: an open pan-genome often indicates a free-living lifestyle with metabolic versatility, while closed pan-genomes are linked to host-restricted, ecologically specialised bacteria. A detailed understanding of the species pan-genome has also been instrumental in tracking the phylodynamics of emerging drug resistance mechanisms and drug-resistant pathogens. However, current approaches to analyse a species’ pan-genome do not take the species population structure into account, nor do they account for the uneven sampling of different lineages, as is commonplace due to over-sampling of clinically relevant representatives. Here we present the application of a population structure-aware approach for classifying genes in a pan-genome based on within-species distribution. We demonstrate our approach on a collection of 7,500 E. coli genomes, one of the most-studied bacterial species used as a model for an open pan-genome. We reveal clearly distinct groups of genes, clustered by different underlying evolutionary dynamics, and provide a more biologically informed and accurate description of the species’ pan-genome. Competing Interest Statement: The authors have declared no competing interest. Footnote: Corrected surname in author list.
A comprehensive and high-quality collection of E. coli genomes and their genes
Escherichia coli is a highly diverse organism which includes a range of commensal and pathogenic variants found worldwide across a range of niches. In addition to causing severe intestinal and extraintestinal disease, E. coli is considered a priority pathogen due to high levels of observed drug resistance. The diversity in the E. coli population is driven by high genome plasticity and a very large gene pool. All these have made E. coli one of the most well-studied organisms, as well as a commonly used laboratory strain. Today, there are thousands of sequenced E. coli genomes stored in public databases. While data is widely available, accessing the information in order to perform analyses can still be a challenge. Collecting relevant available data requires accessing different sources, where data may be stored in a range of formats, and often requires further manipulation and processing to apply various analyses and extract useful information. In this study, we collated and intensely curated a collection of over 10,000 E. coli and Shigella genomes to provide a single, uniform, high-quality dataset. Shigella were included as they are considered specialised pathovars of E. coli. We provide these data in a number of easily accessible formats which can be used as the foundation for future studies addressing the biological differences between E. coli lineages and the distribution and flow of genes in the E. coli population at a high resolution. The analysis we present emphasises our lack of understanding of the true diversity of the E. coli species, and the biased nature of our current understanding of the genetic diversity of such a key pathogen.
Author Notes: All supporting data have been provided within the article or through supplementary data files. All supporting code is provided in the git repository https://github.com/ghoresh11/ecoli_genome_collection.
Significance as a BioResource to the community: As of today, there are more than 140,000 E. coli genomes available on public databases. While data is widely available, collating the data and extracting meaningful information from it often requires multiple steps, computational resources and expert knowledge. Here, we collate a high-quality and comprehensive set of over 10,000 E. coli genomes, isolated from human hosts, into a set of manageable files that offer an accessible and usable snapshot of the currently available genome data, linked to a minimal data quality standard. The data provided includes a detailed synopsis of the main lineages present, including their antimicrobial and virulence profiles, their complete gene content, and all the associated metadata for each genome. This includes a database which enables the user to compare newly sequenced isolates against the assembled genomes. Additionally, we provide a searchable index which allows the user to query any DNA sequence against the assemblies of the collection. This collection paves the path for many future studies, including those investigating the differences between E. coli lineages, following the evolution of different genes in the E. coli pan-genome and exploring the dynamics of horizontal gene transfer in this important organism.
Data Summary:
1. The complete aggregated metadata of 10,146 high-quality genomes isolated from human hosts (doi.org/10.6084/m9.figshare.12514883, File F1).
2. A PopPUNK database which can be used to query any genome and examine its context relative to this collection (deposited to doi.org/10.6084/m9.figshare.12650834).
3. A BIGSI index of all the genomes which can be used to easily and quickly query the genomes for any DNA sequence of 61 bp or longer (deposited to doi.org/10.6084/m9.figshare.12666497).
4. Description and complete profiling of the 50 largest lineages, which represent the majority of publicly available human-isolated E. coli genomes (doi.org/10.6084/m9.figshare.12514883, File F2). Phylogenetic trees of representative genomes of these lineages, presented in this manuscript, are also provided (doi.org/10.6084/m9.figshare.12514883, Files tree_500.nwk and tree_50.nwk).
5. The complete pan-genome of the 50 largest lineages, which includes:
    1. A FASTA file containing a single representative sequence of each gene of the gene pool (doi.org/10.6084/m9.figshare.12514883, File F3).
    2. Complete gene presence-absence across all isolates (doi.org/10.6084/m9.figshare.12514883, File F4).
    3. The frequency of each gene within each of the lineages (doi.org/10.6084/m9.figshare.12514883, File F5).
    4. The representative sequences from each lineage for all the genes (doi.org/10.6084/m9.figshare.12514883, File F6).
Competing Interest Statement: The authors have declared no competing interest.
Footnotes: https://doi.org/10.6084/m9.figshare.12514883; https://doi.org/10.6084/m9.figshare.12650834; https://doi.org/10.6084/m9.figshare.12666497
Abbreviations: HGT, horizontal gene transfer; EPEC, enteropathogenic E. coli; ETEC, enterotoxigenic E. coli; EHEC, enterohaemorrhagic E. coli; EAEC, enteroaggregative E. coli; EIEC, enteroinvasive E. coli; DAEC, diffusely adherent E. coli; AIEC, adherent invasive E. coli; ExPEC, extraintestinal E. coli; CDS, coding sequence; ST, sequence type; AMR, antimicrobial resistance; PHE, Public Health England; FDA, Food and Drug Administration; CDC, Centers for Disease Control and Prevention; GEMS, Global Enteric Multicenter Study; MDR, multidrug resistant; SNP, single nucleotide polymorphism.
Exploring bacterial diversity via a curated and searchable snapshot of archived DNA sequences
The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function, and even anthropogenic activities such as the widespread use of antimicrobials. Whilst these archives are rich in data, considerable processing is required before biological questions can be addressed. Here, we assembled and characterised 661,405 bacterial genomes using a uniform standardised approach, retrieved from the European Nucleotide Archive (ENA) in November of 2018. A searchable COBS index has been produced, facilitating the easy interrogation of the entire dataset for a specific gene or mutation. Additional MinHash and pp-sketch indices support genome-wide comparisons and estimations of genomic distance. An analysis on this scale revealed the uneven species composition in the ENA/public databases, with just 20 of the total 2,336 species making up 90% of the genomes. The over-represented species tend to be acute/common human pathogens. This aligns with research priorities at different levels from individuals with targeted but focused research questions, areas of focus for the funding bodies or national public health agencies, to those identified globally as priority pathogens by the WHO for their resistance to front and last line antimicrobials. Understanding the actual and potential biases in bacterial diversity depicted in this snapshot, and hence within the data being submitted to the public sequencing archives, is essential if we are to target and fill gaps in our understanding of the bacterial kingdom. Competing Interest Statement: The authors have declared no competing interest.
Producing Polished Prokaryotic Pangenomes with the Panaroo Pipeline
Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content resulting from frequent horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here we introduce Panaroo, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. We verified our approach through extensive simulations of de novo assemblies using the infinitely many genes model and by analysing a number of publicly available large bacterial genome datasets. Using a highly clonal Mycobacterium tuberculosis dataset as a negative control case, we show that failing to account for annotation errors can lead to pangenome estimates that are dominated by error. We additionally demonstrate the utility of the improved graphical output provided by Panaroo by performing a pan-genome wide association study in Neisseria gonorrhoeae and by analysing gene gain and loss rates across 51 of the major global pneumococcal sequence clusters. Panaroo is freely available under an open source MIT licence at https://github.com/gtonkinhill/panaroo. Footnotes: https://github.com/gtonkinhill/panaroo/; https://github.com/gtonkinhill/panaroo_manuscript; https://doi.org/10.5281/zenodo.3599800
An outburst from a massive star 40 days before a supernova explosion
A mass-loss event 40 days before the explosion of the type IIn supernova SN 2010mc has been detected; the outburst indicates that there is a causal relation between explosive mass-loss events seen in some massive stars before their explosion and the onset of the supernova explosion. Energetic mass loss precedes supernova explosion. Various lines of evidence suggest that very massive stars experience extreme mass-loss episodes shortly before they explode as supernovae. This paper reports the observation of one such event: 40 days before the explosion of the type IIn supernova SN 2010mc its progenitor underwent an energetic outburst that released 0.01 solar masses of material at velocities of around 2,000 km per second. The luminosity and velocity of the outburst are consistent with the predictions of the wave-driven pulsation model of supernova explosions. Some observations suggest that very massive stars experience extreme mass-loss episodes shortly before they explode as supernovae [1, 2, 3, 4], as do several models [5, 6, 7]. Establishing a causal connection between these mass-loss episodes and the final explosion would provide a novel way to study pre-supernova massive-star evolution. Here we report observations of a mass-loss event detected 40 days before the explosion of the type IIn supernova SN 2010mc (also known as PTF 10tel). Our photometric and spectroscopic data suggest that this event is a result of an energetic outburst, radiating at least 6 × 10^47 erg of energy and releasing about 10^−2 solar masses of material at typical velocities of 2,000 km s^−1. The temporal proximity of the mass-loss outburst and the supernova explosion implies a causal connection between them. Moreover, we find that the outburst luminosity and velocity are consistent with the predictions of the wave-driven pulsation model [6], and disfavour alternative suggestions [7].
The complex circumstellar environment of supernova 2023ixf
The early evolution of a supernova (SN) can reveal information about the environment and the progenitor star. When a star explodes in vacuum, the first photons to escape from its surface appear as a brief, hours-long shock-breakout flare [1, 2], followed by a cooling phase of emission. However, for stars exploding within a distribution of dense, optically thick circumstellar material (CSM), the first photons escape from the material beyond the stellar edge and the duration of the initial flare can extend to several days, during which the escaping emission indicates photospheric heating [3]. Early serendipitous observations [2, 4] that lacked ultraviolet (UV) data were unable to determine whether the early emission is heating or cooling and hence the nature of the early explosion event. Here we report UV spectra of the nearby SN 2023ixf in the galaxy Messier 101 (M101). Using the UV data as well as a comprehensive set of further multiwavelength observations, we temporally resolve the emergence of the explosion shock from a thick medium heated by the SN emission. We derive a reliable bolometric light curve that indicates that the shock breaks out from a dense layer with a radius substantially larger than typical supergiants. Using ultraviolet data as well as a comprehensive set of further multiwavelength observations of the supernova 2023ixf, a reliable bolometric light curve is derived that indicates the heating nature of the early emission.
A hot and fast ultra-stripped supernova that likely formed a compact neutron star binary
Some types of core-collapse supernovae are known to produce a neutron star (NS). A binary NS merger was recently detected from its gravitational wave emission, but it is unclear how such a tight binary system can be formed. De et al. discovered a core-collapse supernova with unusual properties, including the removal of the outer layers of the star before the explosion. They interpret this as the second supernova in an interacting binary system that already contains one NS. Because the explosion probably produced a second NS (rather than a black hole) in a tight orbit, it could be an example of how binary NS systems form. Science, this issue p. 201. An unusual core-collapse supernova appears to have formed a binary neutron star in a tight orbit. Compact neutron star binary systems are produced from binary massive stars through stellar evolution involving up to two supernova explosions. The final stages in the formation of these systems have not been directly observed. We report the discovery of iPTF 14gqr (SN 2014ft), a type Ic supernova with a fast-evolving light curve indicating an extremely low ejecta mass (≈0.2 solar masses) and low kinetic energy (≈2 × 10^50 ergs). Early photometry and spectroscopy reveal evidence of shock cooling of an extended helium-rich envelope, likely ejected in an intense pre-explosion mass-loss episode of the progenitor. Taken together, we interpret iPTF 14gqr as evidence for ultra-stripped supernovae that form neutron stars in compact binary systems.