23 result(s) for "Rumker, Laurie"
Efficient and precise single-cell reference atlas mapping with Symphony
Recent advances in single-cell technologies and integration algorithms make it possible to construct comprehensive reference atlases encompassing many donors, studies, disease states, and sequencing platforms. Much like mapping sequencing reads to a reference genome, it is essential to be able to map query cells onto complex, multimillion-cell reference atlases to rapidly identify relevant cell states and phenotypes. We present Symphony (https://github.com/immunogenomics/symphony), an algorithm for building large-scale, integrated reference atlases in a convenient, portable format that enables efficient query mapping within seconds. Symphony localizes query cells within a stable low-dimensional reference embedding, facilitating reproducible downstream transfer of reference-defined annotations to the query. We demonstrate the power of Symphony in multiple real-world datasets, including (1) mapping a multi-donor, multi-species query to predict pancreatic cell types, (2) localizing query cells along a developmental trajectory of fetal liver hematopoiesis, and (3) inferring surface protein expression with a multimodal CITE-seq atlas of memory T cells. The number of single-cell RNA-seq datasets generated is increasing rapidly, making methods that map cell types to well-curated references increasingly important. Here, the authors propose an accurate method for mapping single cells onto a reference atlas in seconds.
Natural Language Processing Reveals Vulnerable Mental Health Support Groups and Heightened Health Anxiety on Reddit During COVID-19: Observational Study
The COVID-19 pandemic is impacting mental health, but it is not clear how people with different types of mental health problems were differentially impacted as the initial wave of cases hit. The aim of this study is to leverage natural language processing (NLP) with the goal of characterizing changes in 15 of the world's largest mental health support groups (eg, r/schizophrenia, r/SuicideWatch, r/Depression) found on the website Reddit, along with 11 non-mental health groups (eg, r/PersonalFinance, r/conspiracy) during the initial stage of the pandemic. We created and released the Reddit Mental Health Dataset including posts from 826,961 unique users from 2018 to 2020. Using regression, we analyzed trends from 90 text-derived features such as sentiment analysis, personal pronouns, and semantic categories. Using supervised machine learning, we classified posts into their respective support groups and interpreted important features to understand how different problems manifest in language. We applied unsupervised methods such as topic modeling and unsupervised clustering to uncover concerns throughout Reddit before and during the pandemic. We found that the r/HealthAnxiety forum showed spikes in posts about COVID-19 early on in January, approximately 2 months before other support groups started posting about the pandemic. There were many features that significantly increased during COVID-19 for specific groups including the categories "economic stress," "isolation," and "home," while others such as "motion" significantly decreased. We found that support groups related to attention-deficit/hyperactivity disorder, eating disorders, and anxiety showed the most negative semantic change during the pandemic out of all mental health groups. Health anxiety emerged as a general theme across Reddit through independent supervised and unsupervised machine learning analyses.
For instance, we provide evidence that the concerns of a diverse set of individuals are converging in this unique moment of history; we discovered that the more users posted about COVID-19, the more linguistically similar (less distant) the mental health support groups became to r/HealthAnxiety (ρ=-0.96, P<.001). Using unsupervised clustering, we found the suicidality and loneliness clusters more than doubled in the number of posts during the pandemic. Specifically, the support groups for borderline personality disorder and posttraumatic stress disorder became significantly associated with the suicidality cluster. Furthermore, clusters surrounding self-harm and entertainment emerged. By using a broad set of NLP techniques and analyzing a baseline of prepandemic posts, we uncovered patterns of how specific mental health problems manifest in language, identified at-risk users, and revealed the distribution of concerns across Reddit, which could help provide better resources to its millions of users. We then demonstrated that textual analysis is sensitive to uncover mental health complaints as they appear in real time, identifying vulnerable groups and alarming themes during COVID-19, and thus may have utility during the ongoing pandemic and other world-changing events such as elections and protests.
Co-varying neighborhood analysis identifies cell populations associated with phenotypes of interest from single-cell transcriptomics
As single-cell datasets grow in sample size, there is a critical need to characterize cell states that vary across samples and associate with sample attributes, such as clinical phenotypes. Current statistical approaches typically map cells to clusters and then assess differences in cluster abundance. Here we present co-varying neighborhood analysis (CNA), an unbiased method to identify associated cell populations with greater flexibility than cluster-based approaches. CNA characterizes dominant axes of variation across samples by identifying groups of small regions in transcriptional space—termed neighborhoods—that co-vary in abundance across samples, suggesting shared function or regulation. CNA performs statistical testing for associations between any sample-level attribute and the abundances of these co-varying neighborhood groups. Simulations show that CNA enables more sensitive and accurate identification of disease-associated cell states than a cluster-based approach. When applied to published datasets, CNA captures a Notch activation signature in rheumatoid arthritis, identifies monocyte populations expanded in sepsis and identifies a novel T cell population associated with progression to active tuberculosis. Inter-sample variability reveals disease-associated cell subpopulations in single-cell RNA sequencing.
The genetic basis of autoimmunity seen through the lens of T cell functional traits
Autoimmune disease heritability is enriched in T cell-specific regulatory regions of the genome. Modern-day T cell datasets now enable association studies between single nucleotide polymorphisms (SNPs) and a myriad of molecular phenotypes, including chromatin accessibility, gene expression, transcriptional programs, T cell antigen receptor (TCR) amino acid usage, and cell state abundances. Such studies have identified hundreds of quantitative trait loci (QTLs) in T cells that colocalize with genetic risk for autoimmune disease. The key challenge facing immunologists today lies in synthesizing these results toward a unified understanding of the autoimmune T cell: which genes, cell states, and antigens drive tissue destruction? Genetic risk variants for autoimmune diseases are largely enriched in T cell-specific regulatory regions. In this review, Raychaudhuri and colleagues summarise the findings of recent studies evaluating the genetic regulation of T cell molecular and functional traits in these diseases.
Tutorial: a statistical genetics guide to identifying HLA alleles driving complex disease
The human leukocyte antigen (HLA) locus is associated with more complex diseases than any other locus in the human genome. In many diseases, HLA explains more heritability than all other known loci combined. In silico HLA imputation methods enable rapid and accurate estimation of HLA alleles in the millions of individuals that are already genotyped on microarrays. HLA imputation has been used to define causal variation in autoimmune diseases, such as type I diabetes, and in human immunodeficiency virus infection control. However, there are few guidelines on performing HLA imputation, association testing, and fine mapping. Here, we present a comprehensive tutorial to impute HLA alleles from genotype data. We provide detailed guidance on performing standard quality control measures for input genotyping data and describe options to impute HLA alleles and amino acids either locally or using the web-based Michigan Imputation Server, which hosts a multi-ancestry HLA imputation reference panel. We also offer best practice recommendations to conduct association tests to define the alleles, amino acids, and haplotypes that affect human traits. Along with the pipeline, we provide a step-by-step online guide with scripts and available software (https://github.com/immunogenomics/HLA_analyses_tutorial). This tutorial will be broadly applicable to large-scale genotyping data and will contribute to defining the role of HLA in human diseases across global populations. This tutorial provides guidelines for imputing human leukocyte antigen alleles, including standard quality control measures for input genotyping data and best practice recommendations for association testing and fine-mapping to identify causal alleles.
Multidomain analyses of a longitudinal human microbiome intestinal cleanout perturbation experiment
Our work focuses on the stability, resilience, and response to perturbation of the bacterial communities in the human gut. Informative flash flood-like disturbances that eliminate most gastrointestinal biomass can be induced using a clinically-relevant iso-osmotic agent. We designed and executed such a disturbance in human volunteers using a dense longitudinal sampling scheme extending before and after induced diarrhea. This experiment has enabled a careful multidomain analysis of a controlled perturbation of the human gut microbiota with a new level of resolution. These new longitudinal multidomain data were analyzed using recently developed statistical methods that demonstrate improvements over current practices. By imposing sparsity constraints we have enhanced the interpretability of the analyses and by employing a new adaptive generalized principal components analysis, incorporated modulated phylogenetic information and enhanced interpretation through scoring of the portions of the tree most influenced by the perturbation. Our analyses leverage the taxa-sample duality in the data to show how the gut microbiota recovers following this perturbation. Through a holistic approach that integrates phylogenetic, metagenomic and abundance information, we elucidate patterns of taxonomic and functional change that characterize the community recovery process across individuals. We provide complete code and illustrations of new sparse statistical methods for high-dimensional, longitudinal multidomain data that provide greater interpretability than existing methods.
Designing sensitive viral diagnostics with machine learning
Design of nucleic acid-based viral diagnostics typically follows heuristic rules and, to contend with viral variation, focuses on a genome’s conserved regions. A design process could, instead, directly optimize diagnostic effectiveness using a learned model of sensitivity for targets and their variants. Toward that goal, we screen 19,209 diagnostic–target pairs, concentrated on CRISPR-based diagnostics, and train a deep neural network to accurately predict diagnostic readout. We join this model with combinatorial optimization to maximize sensitivity over the full spectrum of a virus’s genomic variation. We introduce Activity-informed Design with All-inclusive Patrolling of Targets (ADAPT), a system for automated design, and use it to design diagnostics for 1,933 vertebrate-infecting viral species within 2 hours for most species and within 24 hours for all but three. We experimentally show that ADAPT’s designs are sensitive and specific to the lineage level and permit lower limits of detection, across a virus’s variation, than the outputs of standard design techniques. Our strategy could facilitate a proactive resource of assays for detecting pathogens. Viral diagnostics with maximum sensitivity are designed using machine learning and combinatorial optimization.
Modeling Inter-Individual Variation in Single-Cell Datasets to Detect Cell State Abundance Associations to Clinical Features and Genetic Variants
In order to understand disease development, create effective medical treatments, and predict clinical outcomes, researchers study body tissue sampled from a wide variety of human donors. Researchers seek to detect tissue characteristics that associate with donor attributes like disease risk or treatment response. The advent of single-cell genomic technologies has enabled unbiased acquisition of diverse measurements, such as gene expression or chromatin accessibility, for each cell in a tissue sample. Single-cell datasets reveal the complexity of tissue composition in the human body at unprecedented resolution and provide new opportunities to detect tissue associations to donor attributes. In particular, single-cell datasets offer new opportunities to reveal what kinds of cells, among many possible “cell states,” associate in abundance with donor attributes of interest. However, existing association-testing approaches are anchored in researcher-driven choices about which cell states are most relevant and these choices can limit the scope of associations detected. This thesis presents a novel approach that offers more flexible and data-driven identification of cell states associated in abundance with donor attributes. Our approach enables researchers to take better advantage of the rich information available in single-cell datasets and has already offered new insight into diseases including tuberculosis and systemic lupus erythematosus. The first portion of this thesis introduces our novel framework for cell state abundance association testing. This framework leverages both the granularity of cell states and the variation across donors that are revealed in tissues by single-cell datasets. In this framework, we quantify cell abundance per donor across many granular cell states termed “neighborhoods” and uncover patterns of neighborhood abundance variation that are shared across donors.
We illustrate how this framework produces a set of derived tissue features per donor that can be used to improve statistical power and accuracy in the detection of cell state abundance associations relative to the existing paradigm. We apply this framework to single-cell datasets of blood tissue to characterize immune dysfunction in autoimmunity and infection. Modeling cell state abundance associations at fine-grained resolution offers important advantages, but also necessitates new considerations for potential sources of confounding. The second portion of this thesis characterizes a source of confounding in cell state abundance association testing to which neighborhood-resolution models of single-cell data are particularly vulnerable. We also introduce a strategy to address this confounding that offers benefits across multiple neighborhood-based association-testing tools. Cell states that are differentially abundant between donors with and without a disease may result from disease development processes or disease sequelae. However, genetic variants that confer an elevated risk of disease may also associate with the abundance of cell states and more specifically illuminate causal processes of disease development. The final portion of this thesis introduces a tool that adapts our framework to flexibly detect cell state abundance associations to genetic variants at genome-wide scale. In a dataset of blood tissue, we reveal novel genotype-phenotype associations that offer clues about genetic mechanisms of immune-mediated disease risk. This work demonstrates the importance of modeling cell states in single-cell datasets at fine-grained resolution and the value in examining shared patterns of abundance across individuals. It is our hope that the methods produced by this work will empower researchers to unlock new insights from single-cell datasets, expanding our understanding of disease and ultimately improving human health and health care.
Identifying genetic variants that influence the abundance of cell states in single-cell data
Disease risk alleles influence the composition of cells present in the body, but modeling genetic effects on the cell states revealed by single-cell profiling is difficult because variant-associated states may reflect diverse combinations of the profiled cell features that are challenging to predefine. We introduce Genotype–Neighborhood Associations (GeNA), a statistical tool to identify cell-state abundance quantitative trait loci (csaQTLs) in high-dimensional single-cell datasets. Instead of testing associations to predefined cell states, GeNA flexibly identifies the cell states whose abundance is most associated with genetic variants. In a genome-wide survey of single-cell RNA sequencing peripheral blood profiling from 969 individuals, GeNA identifies five independent loci associated with shifts in the relative abundance of immune cell states. For example, rs3003-T (P = 1.96 × 10^-11) associates with increased abundance of natural killer cells expressing tumor necrosis factor response programs. This csaQTL colocalizes with increased risk for psoriasis, an autoimmune disease that responds to anti-tumor necrosis factor treatments. Flexibly characterizing csaQTLs for granular cell states may help illuminate how genetic background alters cellular composition to confer disease risk. GeNA identifies cell-state abundance quantitative trait loci (csaQTLs) in single-cell RNA sequencing data. Applied to OneK1K, GeNA identifies natural killer cell and myeloid csaQTLs and implicates interferon-α-related cell states using a polygenic risk score for systemic lupus erythematosus.
Covarying neighborhood analysis identifies cell populations associated with phenotypes of interest from single-cell transcriptomics
As single-cell datasets grow in sample size, there is a critical need to characterize cell states that vary across samples and associate with sample attributes like clinical phenotypes. Current statistical approaches typically map cells to clusters then assess differences in cluster abundance. We present covarying neighborhood analysis (CNA), an unbiased method to identify associated cell populations with greater flexibility than cluster-based approaches. CNA characterizes dominant axes of variation across samples by identifying groups of small regions in transcriptional space—termed neighborhoods—that covary in abundance across samples, suggesting shared function or regulation. CNA performs statistical testing for associations between any sample-level attribute and the abundances of these covarying neighborhood groups. Simulations show that CNA enables more sensitive and accurate identification of disease-associated cell states than a cluster-based approach. When applied to published datasets, CNA captures a Notch activation signature in rheumatoid arthritis, identifies monocyte populations expanded in sepsis, and identifies a novel T-cell population associated with progression to active tuberculosis.