Catalogue Search | MBRL
302 result(s) for "David Burstein"
Deciphering microbial gene function using natural language processing
2022
Revealing the function of uncharacterized genes is a fundamental challenge in an era of ever-increasing volumes of sequencing data. Here, we present a concept for tackling this challenge using deep learning methodologies adopted from natural language processing (NLP). We repurpose NLP algorithms to model “gene semantics” based on a biological corpus of more than 360 million microbial genes within their genomic context. We use the language models to predict functional categories for 56,617 genes and find that out of 1369 genes associated with recently discovered defense systems, 98% are inferred correctly. We then systematically evaluate the “discovery potential” of different functional categories, pinpointing those with the most genes yet to be characterized. Finally, we demonstrate our method’s ability to discover systems associated with microbial interaction and defense. Our results highlight that combining microbial genomics and language models is a promising avenue for revealing gene functions in microbes.
The function of many microbial genes is yet unknown. Here the authors repurposed natural language processing algorithms to explore “gene semantics” and infer function for thousands of genes with defense and secretion systems found to have the most discovery potential.
Journal Article
Programmed DNA destruction by miniature CRISPR-Cas14 enzymes
by Harrington, Lucas B.; Kyrpides, Nikos C.; Chen, Janice S.
in Adaptive immunity, Adaptive systems, Amino acids
2018
CRISPR-Cas9 systems have been causing a revolution in biology. Harrington et al. describe the discovery and technological implementation of an additional type of CRISPR system based on an extracompact effector protein, Cas14. Metagenomics data, particularly from uncultivated samples, uncovered the CRISPR-Cas14 systems containing all the components necessary for adaptive immunity in prokaryotes. At half the size of class 2 CRISPR effectors, Cas14 appears to target single-stranded DNA without class 2 sequence restrictions. By leveraging this activity, a fast and high-fidelity nucleic acid detection system enabled detection of single-nucleotide polymorphisms.
Science, this issue p. 839
Identification, characterization, and technological implementation of additional archaea-derived CRISPR-Cas14 systems are described.
CRISPR-Cas systems provide microbes with adaptive immunity to infectious nucleic acids and are widely employed as genome editing tools. These tools use RNA-guided Cas proteins whose large size (950 to 1400 amino acids) has been considered essential to their specific DNA- or RNA-targeting activities. Here we present a set of CRISPR-Cas systems from uncultivated archaea that contain Cas14, a family of exceptionally compact RNA-guided nucleases (400 to 700 amino acids). Despite their small size, Cas14 proteins are capable of targeted single-stranded DNA (ssDNA) cleavage without restrictive sequence requirements. Moreover, target recognition by Cas14 triggers nonspecific cutting of ssDNA molecules, an activity that enables high-fidelity single-nucleotide polymorphism genotyping (Cas14-DETECTR). Metagenomic data show that multiple CRISPR-Cas14 systems evolved independently and suggest a potential evolutionary origin of single-effector CRISPR-based adaptive immunity.
Journal Article
Two distinct RNase activities of CRISPR-C2c2 enable guide-RNA processing and RNA detection
by Knight, Spencer C.; O’Connell, Mitchell R.; Cate, Jamie H. D.
in 631/326/2521, 631/337/1645, 631/45/500
2016
The CRISPR-associated bacterial enzyme C2c2 is shown to contain two separable, distinct sites for the highly sensitive detection and cleavage of single-stranded RNA.
The RNA cleaving enzyme C2c2
The programmed sequence-specific cleavage of RNA and DNA by CRISPR-associated enzymes has revolutionized genome editing. An alternative to canonical Cas9 nuclease, C2c2, was recently described. Jennifer Doudna and colleagues have probed the biochemistry of this enzyme further, and find that it contains two separable distinct sites that catalyse RNA cleavage. The authors exploit the properties of the second site to show that the enzyme can be used for highly sensitive detection and cleavage of single-stranded RNA.
Bacterial adaptive immune systems use CRISPRs (clustered regularly interspaced short palindromic repeats) and CRISPR-associated (Cas) proteins for RNA-guided nucleic acid cleavage [1,2]. Although most prokaryotic adaptive immune systems generally target DNA substrates [3,4,5], type III and VI CRISPR systems direct interference complexes against single-stranded RNA substrates [6,7,8,9]. In type VI systems, the single-subunit C2c2 protein functions as an RNA-guided RNA endonuclease (RNase) [9,10]. How this enzyme acquires mature CRISPR RNAs (crRNAs) that are essential for immune surveillance and how it carries out crRNA-mediated RNA cleavage remain unclear. Here we show that bacterial C2c2 possesses a unique RNase activity responsible for CRISPR RNA maturation that is distinct from its RNA-activated single-stranded RNA degradation activity. These dual RNase functions are chemically and mechanistically different from each other and from the crRNA-processing behaviour of the evolutionarily unrelated CRISPR enzyme Cpf1 (ref. 11). The two RNase activities of C2c2 enable multiplexed processing and loading of guide RNAs that in turn allow sensitive detection of cellular transcripts.
Journal Article
Using big sequencing data to identify chronic SARS-Coronavirus-2 infections
by Miller, Danielle; Harari, Sheri; Burstein, David
in 631/114/1305, 631/181/735, 631/326/596/4130
2024
The evolution of SARS-Coronavirus-2 (SARS-CoV-2) has been characterized by the periodic emergence of highly divergent variants. One leading hypothesis suggests these variants may have emerged during chronic infections of immunocompromised individuals, but limited data from these cases hinders comprehensive analyses. Here, we harnessed millions of SARS-CoV-2 genomes to identify potential chronic infections and used language models (LM) to infer chronic-associated mutations. First, we mined the SARS-CoV-2 phylogeny and identified chronic-like clades with identical metadata (location, age, and sex) spanning over 21 days, suggesting a prolonged infection. We inferred 271 chronic-like clades, which exhibited characteristics similar to confirmed chronic infections. Chronic-associated mutations were often high-fitness immune-evasive mutations located in the spike receptor-binding domain (RBD), yet a minority were unique to chronic infections and absent in global settings. The probability of observing high-fitness RBD mutations was 10-20 times higher in chronic infections than in global transmission chains. The majority of RBD mutations in BA.1/BA.2 chronic-like clades bore predictive value, i.e., went on to display global success. Finally, we used our LM to infer hundreds of additional chronic-like clades in the absence of metadata. Our approach allows mining extensive sequencing data and providing insights into future evolutionary patterns of SARS-CoV-2.
Chronic SARS-CoV-2 infections have been hypothesised to be sources of new variants. Here, the authors use large-scale genome sequencing data to identify mutations predictive of chronic infections, which may therefore be relevant in future variants.
Journal Article
The distinction of CPR bacteria from other bacteria based on protein family content
by Méheust, Raphaël; Castelle, Cindy J.; Banfield, Jillian F.
in 45/23, 631/114/2784, 631/181/757
2019
Candidate phyla radiation (CPR) bacteria separate phylogenetically from other bacteria, but the organismal distribution of their protein families remains unclear. Here, we leveraged sequences from thousands of uncultivated organisms and identified protein families that co-occur in genomes, thus are likely foundational for lineage capacities. Protein family presence/absence patterns cluster CPR bacteria together, and away from all other bacteria and archaea, partly due to proteins without recognizable homology to proteins in other bacteria. Some are likely involved in cell-cell interactions and potentially important for episymbiotic lifestyles. The diversity of protein family combinations in CPR may exceed that of all other bacteria. Over the bacterial tree, protein family presence/absence patterns broadly recapitulate phylogenetic structure, suggesting persistence of core sets of proteins since lineage divergence. The CPR could have arisen in an episode of dramatic but heterogeneous genome reduction or from a protogenote community and co-evolved with other bacteria.
Recent studies have identified a large, phylogenetically distinct clade of bacteria, the candidate phyla radiation (CPR). Here, Méheust and colleagues analyze almost 3600 genomes to characterize the protein family content of CPR versus other bacteria and archaea.
Journal Article
Major bacterial lineages are essentially devoid of CRISPR-Cas viral defence systems
by Probst, Alexander J.; Sharon, Itai; Brown, Christopher T.
in 38/22, 631/208/325/2483, 631/326/596/2148
2016
Current understanding of microorganism–virus interactions, which shape the evolution and functioning of Earth’s ecosystems, is based primarily on cultivated organisms. Here we investigate thousands of viral and microbial genomes recovered using a cultivation-independent approach to study the frequency, variety and taxonomic distribution of viral defence mechanisms. CRISPR-Cas systems that confer microorganisms with immunity to viruses are present in only 10% of 1,724 sampled microorganisms, compared with previous reports of 40% occurrence in bacteria and 81% in archaea. We attribute this large difference to the lack of CRISPR-Cas systems across major bacterial lineages that have no cultivated representatives. We correlate absence of CRISPR-Cas with lack of nucleotide biosynthesis capacity and a symbiotic lifestyle. Restriction systems are well represented in these lineages and might provide both non-specific viral defence and access to nucleotides.
It is thought that CRISPR-Cas systems, which confer acquired immunity to phage and archaeal viruses, are widespread among bacteria and archaea. Here, Burstein et al. show that entire lineages of uncultivated microorganisms are essentially devoid of CRISPR-Cas systems.
Journal Article
Association of Primary Care Physician Compensation Incentives and Quality of Care in the United States, 2012-2016
by Liss, David T; Burstein, David S; Linder, Jeffrey A
in Compensation, Composite materials, Confidence intervals
2022
Background
Physician compensation incentives may have positive or negative effects on clinical quality.
Objective
To assess the association of various physician compensation incentives with technical indicators of primary care quality.
Design
Cross-sectional, nationally representative retrospective analysis.
Participants
Visits by adults to primary care physicians in the National Ambulatory Medical Care Survey from 2012-2016. We analyzed 49,580 sampled visits, representing 1.45 billion primary care visits.
Main Measures
We assessed the association between 5 compensation incentives (quality measure performance, patient experience scores, individual productivity, practice financial performance, or practice efficiency) and 10 high-value and 7 low-value care measures as well as high-value and low-value care composites.
Key Results
Quality measure performance was an incentive in 22% of visits; patient experience scores, 17%; individual productivity, 57%; practice financial performance, 63%; and practice efficiency, 12%. In adjusted models, none of the compensation incentives were consistently associated with individual high- and low-value measures. None of the compensation incentives were associated with high- or low-value care composites. For example, quality measure performance compensation was not significantly associated with high-value care (visits with quality incentive, 47% of eligible measures met; without quality incentive, 43%; adjusted odds ratio [aOR], 1.02; 95% confidence interval [CI], 0.91 to 1.15) or low-value care (aOR, 0.99; 95% CI, 0.82-1.19). Physician compensation incentives that might be expected to increase low-value care did not: patient experience (aOR for low-value care composite, 0.83; 95% CI, 0.65-1.05), individual productivity (aOR, 1.03; 95% CI, 0.88-1.22), and practice financial performance (aOR, 1.05; 95% CI, 0.81-1.36).
Conclusion
In this retrospective, cross-sectional, nationally representative analysis of care in the United States, physician compensation incentives were not generally associated with more or less high- or low-value care.
Journal Article
DRAMMA: a multifaceted machine learning approach for novel antimicrobial resistance gene detection in metagenomic data
by Rannon, Ella; Shaashua, Sagi; Burstein, David
in Amino acids, Anti-Bacterial Agents - pharmacology, Bacteria - classification
2025
Background
Antibiotics are essential for medical procedures, food security, and public health. However, ill-advised usage leads to increased pathogen resistance to antimicrobial substances, posing a threat of fatal infections and limiting the benefits of antibiotics. Therefore, early detection of antimicrobial resistance genes (ARGs), especially in pathogens, is crucial for human health. Most computational methods for ARG detection rely on homology to a predefined gene database and therefore are limited in their ability to discover novel genes.
Results
We introduce DRAMMA, a machine learning method for predicting new ARGs with no sequence similarity to known ARGs or any annotated gene. DRAMMA utilizes various features, including protein properties, genomic context, and evolutionary patterns. The model demonstrated robust predictive performance both in cross-validation and an external validation set annotated by an empirical ARG database. Analyses of the high-ranking model-generated candidates revealed a significant enrichment of candidates within the Bacteroidetes/Chlorobi and Betaproteobacteria taxonomic groups.
Conclusions
DRAMMA enables rapid ARG identification for global-scale genomic and metagenomic samples, thus holding promise for the discovery of novel ARGs that lack sequence similarity to any known resistance genes. Further, our model has the potential to facilitate early detection of specific ARGs, potentially influencing the selection of antibiotics administered to patients.
Journal Article
Engineered B cells expressing an anti-HIV antibody enable memory retention, isotype switching and clonal expansion
2020
ABSTRACT
HIV viremia can be controlled by chronic antiretroviral therapy. As a potentially single-shot alternative, B cells engineered by CRISPR/Cas9 to express anti-HIV broadly neutralizing antibodies (bNAbs) are capable of secreting high antibody titers. Here, we show that, upon immunization of mice, adoptively transferred engineered B cells home to germinal centers (GC) where they predominate over the endogenous response and differentiate into memory and plasma cells while undergoing class switch recombination (CSR). Immunization with a high affinity antigen increases accumulation in GCs and CSR rates. Boost immunization increases the rate of engineered B cells in GCs and antibody secretion, indicating memory retention. Finally, antibody sequences of engineered B cells in the spleen show patterns of clonal selection. Therefore, B cells can be engineered into what could be a living and evolving drug.
Chronic antiretroviral therapy does not eradicate HIV infection. Here, the authors describe a potentially one-shot alternative by engineering B cells to express anti-HIV antibodies and undergo memory retention, isotype switching and clonal expansion.
Journal Article