134 result(s) for "Yang, Jean Y. H."
Benchmarking clustering algorithms on estimating the number of cell types from single-cell RNA-sequencing data
Background: A key task in single-cell RNA-seq (scRNA-seq) data analysis is to accurately detect the number of cell types in the sample, which can be critical for downstream analyses such as cell type identification. Various scRNA-seq data clustering algorithms have been specifically designed to automatically estimate the number of cell types through optimising the number of clusters in a dataset. The lack of benchmark studies, however, complicates the choice of the methods. Results: We systematically benchmark a range of popular clustering algorithms on estimating the number of cell types in a variety of settings by sampling from the Tabula Muris data to create scRNA-seq datasets with a varying number of cell types, varying number of cells in each cell type, and different cell type proportions. The large number of datasets enables us to assess the performance of the algorithms, covering four broad categories of approaches, from various aspects using a panel of criteria. We further cross-compared the performance on datasets with high cell numbers using Tabula Muris and Tabula Sapiens data. Conclusions: We identify the strengths and weaknesses of each method on multiple criteria including the deviation of estimation from the true number of cell types, variability of estimation, clustering concordance of cells to their predefined cell types, and running time and peak memory usage. We then summarise these results into a multi-aspect recommendation to the users. The proposed stability-based approach for estimating the number of cell types is implemented in an R package and is freely available from https://github.com/PYangLab/scCCESS.
Transcriptional downregulation of MHC class I and melanoma de-differentiation in resistance to PD-1 inhibition
Transcriptomic signatures designed to predict melanoma patient responses to PD-1 blockade have been reported but rarely validated. We now show that intra-patient heterogeneity of tumor responses to PD-1 inhibition limits the predictive performance of these signatures. We reasoned that resistance mechanisms will reflect the tumor microenvironment, and thus we examined PD-1 inhibitor resistance relative to T-cell activity in 94 melanoma tumors collected at baseline and at time of PD-1 inhibitor progression. Tumors were analyzed using RNA sequencing and flow cytometry, and validated functionally. These analyses confirm that major histocompatibility complex (MHC) class I downregulation is a hallmark of resistance to PD-1 inhibitors and is associated with the MITF-low/AXL-high de-differentiated phenotype and cancer-associated fibroblast signatures. We demonstrate that TGFβ drives the treatment-resistant phenotype (MITF-low/AXL-high) and contributes to MHC class I downregulation in melanoma. Combinations of anti-PD-1 with drugs that target the TGFβ signaling pathway and/or which reverse melanoma de-differentiation may be effective future therapeutic strategies. A significant proportion of patients develop innate or acquired resistance to immune checkpoint inhibitors. Here, the authors show that resistance to anti-PD-1 blockade is associated with TGF-beta driven major histocompatibility complex I (MHCI) down-regulation and a de-differentiated phenotype in melanoma patients.
BIDCell: Biologically-informed self-supervised learning for segmentation of subcellular spatial transcriptomics data
Recent advances in subcellular imaging transcriptomics platforms have enabled high-resolution spatial mapping of gene expression, while also introducing significant analytical challenges in accurately identifying cells and assigning transcripts. Existing methods grapple with cell segmentation, frequently leading to fragmented cells or oversized cells that capture contaminated expression. To this end, we present BIDCell, a self-supervised deep learning-based framework with biologically-informed loss functions that learn relationships between spatially resolved gene expression and cell morphology. BIDCell incorporates cell-type data, including single-cell transcriptomics data from public repositories, with cell morphology information. Using a comprehensive evaluation framework consisting of metrics in five complementary categories for cell segmentation performance, we demonstrate that BIDCell outperforms other state-of-the-art methods according to many metrics across a variety of tissue types and technology platforms. Our findings underscore the potential of BIDCell to significantly enhance single-cell spatial expression analyses, enabling great potential in biological discovery. Subcellular in situ spatial transcriptomics offers the promise to address biological problems that were previously inaccessible but requires accurate cell segmentation to uncover insights. Here, authors present BIDCell, a biologically informed, deep learning-based cell segmentation framework.
Atlas-scale single-cell multi-sample multi-condition data integration using scMerge2
The recent emergence of multi-sample multi-condition single-cell multi-cohort studies allows researchers to investigate different cell states. The effective integration of multiple large-cohort studies promises biological insights into cells under different conditions that individual studies cannot provide. Here, we present scMerge2, a scalable algorithm that allows data integration of atlas-scale multi-sample multi-condition single-cell studies. We have generalized scMerge2 to enable the merging of millions of cells from single-cell studies generated by various single-cell technologies. Using a large COVID-19 data collection with over five million cells from 1000+ individuals, we demonstrate that scMerge2 enables multi-sample multi-condition scRNA-seq data integration from multiple cohorts and reveals signatures derived from cell-type expression that are more accurate in discriminating disease progression. Further, we demonstrate that scMerge2 can remove dataset variability in CyTOF, imaging mass cytometry and CITE-seq experiments, demonstrating its applicability to a broad spectrum of single-cell profiling technologies. Recent advances in multi-condition single-cell multi-cohort studies enable exploration of diverse cell states. Here, authors present scMerge2, an algorithm that allows integration of a large COVID-19 data collection with over five million cells to uncover distinct signatures of disease progression.
Autoencoder-based cluster ensembles for single-cell RNA-seq data analysis
Background: Single-cell RNA-sequencing (scRNA-seq) is a transformative technology, allowing global transcriptomes of individual cells to be profiled with high accuracy. An essential task in scRNA-seq data analysis is the identification of cell types from complex samples or tissues profiled in an experiment. To this end, clustering has become a key computational technique for grouping cells based on their transcriptome profiles, enabling subsequent cell type identification from each cluster of cells. Due to the high feature-dimensionality of the transcriptome (i.e. the large number of measured genes in each cell) and because only a small fraction of genes are cell type-specific and therefore informative for generating cell type-specific clusters, clustering directly on the original feature/gene dimension may lead to uninformative clusters and hinder correct cell type identification. Results: Here, we propose an autoencoder-based cluster ensemble framework in which we first take random subspace projections from the data, then compress each random projection to a low-dimensional space using an autoencoder artificial neural network, and finally apply ensemble clustering across all encoded datasets to generate clusters of cells. We employ four evaluation metrics to benchmark clustering performance and our experiments demonstrate that the proposed autoencoder-based cluster ensemble can lead to substantially improved cell type-specific clusters when applied with both the standard k-means clustering algorithm and a state-of-the-art kernel-based clustering algorithm (SIMLR) designed specifically for scRNA-seq data. Compared to directly using these clustering algorithms on the original datasets, the performance improvement in some cases is up to 100%, depending on the evaluation metric used. Conclusions: Our results suggest that the proposed framework can facilitate more accurate cell type identification as well as other downstream analyses.
The code for creating the proposed autoencoder-based cluster ensemble framework is freely available from https://github.com/gedcom/scCCESS
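The ensemble idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the autoencoder compression step is left out entirely, random feature subsets stand in for the random subspace projections, and the consensus rule (linking cells that are co-clustered in more than half of the runs) is an assumption chosen for brevity. All function names and the toy data are hypothetical.

```python
import random


def kmeans_labels(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def ensemble_clusters(data, k, n_runs=10, n_features=3, seed=0):
    """Cluster random feature subsets, then merge the runs by co-association."""
    rng = random.Random(seed)
    n = len(data)
    together = [[0] * n for _ in range(n)]  # co-clustering counts per pair
    for _ in range(n_runs):
        # Assumption: a random feature subset stands in for a random projection.
        cols = rng.sample(range(len(data[0])), n_features)
        projected = [[row[c] for c in cols] for row in data]
        labels = kmeans_labels(projected, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1
    # Consensus (assumed rule): link cells co-clustered in more than half of
    # the runs and report the connected components as final clusters.
    consensus = [-1] * n
    next_label = 0
    for start in range(n):
        if consensus[start] != -1:
            continue
        consensus[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if consensus[j] == -1 and together[i][j] * 2 > n_runs:
                    consensus[j] = next_label
                    stack.append(j)
        next_label += 1
    return consensus


# Toy data: two well-separated groups of 15 "cells" with 6 features each.
data_rng = random.Random(42)
cells = [[center + data_rng.uniform(-0.5, 0.5) for _ in range(6)]
         for center in [0.0] * 15 + [10.0] * 15]
consensus = ensemble_clusters(cells, k=2)
```

Because the final labels come from connected components, the number of consensus clusters is not forced to equal k; on noisy data a different consensus rule may be preferable.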
Scalable workflow for characterization of cell-cell communication in COVID-19 patients
COVID-19 patients display a wide range of disease severity, ranging from asymptomatic to critical symptoms with high mortality risk. Our ability to understand the interaction of SARS-CoV-2 infected cells within the lung, and of protective or dysfunctional immune responses to the virus, is critical to effectively treat these patients. Currently, our understanding of cell-cell interactions across different disease states, and how such interactions may drive pathogenic outcomes, is incomplete. Here, we developed a generalizable and scalable workflow for identifying cells that are differentially interacting across COVID-19 patients with distinct disease outcomes and use this to examine eight public single-cell RNA-seq datasets (six from peripheral blood mononuclear cells, one from bronchoalveolar lavage and one from nasopharyngeal), with a total of 211 individual samples. By characterizing the cell-cell interaction patterns across epithelial and immune cells in lung tissues for patients with varying disease severity, we illustrate diverse communication patterns across individuals, and discover heterogeneous communication patterns among moderate and severe patients. We further illustrate patterns derived from cell-cell interactions are potential signatures for discriminating between moderate and severe patients. Overall, this workflow can be generalized and scaled to combine multiple scRNA-seq datasets to uncover cell-cell interactions.
Single-cell RNA-Seq analysis reveals dynamic trajectories during mouse liver development
Background: The differentiation and maturation trajectories of fetal liver stem/progenitor cells (LSPCs) are not fully understood at single-cell resolution, and a priori knowledge of limited biomarkers could restrict trajectory tracking. Results: We employed marker-free single-cell RNA-Seq to characterize comprehensive transcriptional profiles of 507 cells randomly selected from seven stages between embryonic day 11.5 and postnatal day 2.5 during mouse liver development, and also 52 Epcam-positive cholangiocytes from postnatal day 3.25 mouse livers. LSPCs in developing mouse livers were identified via marker-free transcriptomic profiling. Single-cell resolution dynamic developmental trajectories of LSPCs exhibited contiguous but discrete genetic control through transcription factors and signaling pathways. The gene expression profiles of cholangiocytes were closer to those of embryonic day 11.5 LSPCs than to those of later-staged LSPCs, cuing the fate decision stage of LSPCs. Our marker-free approach also allows systematic assessment and prediction of isolation biomarkers for LSPCs. Conclusions: Our data provide not only a valuable resource but also novel insights into the fate decision and transcriptional control of self-renewal, differentiation and maturation of LSPCs.
Clonal evolution in liver cancer at single-cell and single-variant resolution
Genetic heterogeneity of a tumor is closely related to its clonal evolution, phenotypic diversity and treatment resistance, and such heterogeneity has only been characterized at single-cell sub-chromosomal scale in liver cancer. Here we reconstructed the single-variant resolution clonal evolution in human liver cancer based on single-cell mutational profiles. The results indicated that key genetic events occurred early during tumorigenesis, and an early metastasis followed by independent evolution was observed in primary liver tumor and intrahepatic metastatic portal vein tumor thrombus. By parallel single-cell RNA-Seq, the transcriptomic phenotype of HCC was found to be related to genetic heterogeneity. For the first time we reconstructed the single-cell and single-variant clonal evolution in human liver cancer, and dissection of both genetic and phenotypic heterogeneity will facilitate better understanding of their relationship.
scJoint integrates atlas-scale single-cell RNA-seq and ATAC-seq data with transfer learning
Single-cell multiomics data continues to grow at an unprecedented pace. Although several methods have demonstrated promising results in integrating several data modalities from the same tissue, the complexity and scale of data compositions present in cell atlases still pose a challenge. Here, we present scJoint, a transfer learning method to integrate atlas-scale, heterogeneous collections of scRNA-seq and scATAC-seq data. scJoint leverages information from annotated scRNA-seq data in a semisupervised framework and uses a neural network to simultaneously train labeled and unlabeled data, allowing label transfer and joint visualization in an integrative framework. Using atlas data as well as multimodal datasets generated with ASAP-seq and CITE-seq, we demonstrate that scJoint is computationally efficient and consistently achieves substantially higher cell-type label accuracy than existing methods while providing meaningful joint visualizations. Thus, scJoint overcomes the heterogeneity of different data modalities to enable a more comprehensive understanding of cellular phenotypes. Integration of data from single-cell RNA-seq and ATAC-seq is achieved with transfer learning.
LC-N2G: a local consistency approach for nutrigenomics data analysis
Background: Nutrigenomics aims at understanding the interaction between nutrition and gene information. Due to the complex interactions of nutrients and genes, their relationship exhibits non-linearity. One of the most effective and efficient methods to explore their relationship is the nutritional geometry framework, which fits a response surface for the gene expression over two prespecified nutrition variables. However, when the number of nutrients involved is large, it is challenging to find combinations of informative nutrients with respect to a certain gene and to test whether the relationship is stronger than chance. Methods for identifying informative combinations are essential to understanding the relationship between nutrients and genes. Results: We introduce Local Consistency Nutrition to Graphics (LC-N2G), a novel approach for ranking and identifying combinations of nutrients with gene expression. In LC-N2G, we first propose a model-free quantity called the Local Consistency (LC) statistic to measure whether there is a non-random relationship between combinations of nutrients and gene expression measurements based on (1) the similarity between samples in the nutrient space and (2) their difference in gene expression. Then combinations with small LC are selected and a permutation test is performed to evaluate their significance. Finally, the response surfaces are generated for the subset of significant relationships. Evaluation on simulated data and real data shows that LC-N2G can accurately find combinations that are correlated with gene expression. Conclusion: LC-N2G is practically powerful for identifying the informative nutrition variables correlated with gene expression. Therefore, LC-N2G is important in the area of nutrigenomics for understanding the relationship between nutrition and gene expression information.
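The abstract does not give the formula for the LC statistic, so the following is only a rough sketch of the general idea under a loudly stated assumption: here the statistic is taken to be the mean absolute expression difference between each sample and its k nearest neighbours in nutrient space, so that small values indicate local consistency. The function names, the choice of k, and the toy data are all hypothetical and not taken from the paper.

```python
import math
import random


def lc_statistic(nutrients, expression, k=3):
    """Mean absolute expression gap between each sample and its k nearest
    neighbours in nutrient space (illustrative definition, an assumption)."""
    n = len(nutrients)
    total = 0.0
    for i in range(n):
        # Rank the other samples by Euclidean distance in nutrient space.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(nutrients[i], nutrients[j]),
        )[:k]
        total += sum(abs(expression[i] - expression[j]) for j in neighbours)
    return total / (n * k)


def permutation_p_value(nutrients, expression, k=3, n_perm=200, seed=0):
    """Share of shuffles whose statistic is at least as small as the observed one."""
    rng = random.Random(seed)
    observed = lc_statistic(nutrients, expression, k)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        # Shuffling expression breaks any link to the nutrient coordinates.
        rng.shuffle(shuffled)
        if lc_statistic(nutrients, shuffled, k) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data: expression depends smoothly on two hypothetical nutrient variables.
data_rng = random.Random(1)
nutrients = [(data_rng.random(), data_rng.random()) for _ in range(40)]
expression = [2.0 * a + b for a, b in nutrients]
p_value = permutation_p_value(nutrients, expression)
```

A small `p_value` here means the shuffled data rarely look as locally consistent as the observed data, which matches the abstract's description of selecting combinations with small LC and then testing them by permutation.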