Catalogue Search | MBRL

Inferring population dynamics from single-cell RNA-sequencing time series data

by Hasenauer, Jan , Genga, Ryan M J , Fiedler, Anna K in Apoptosis , Beta cells , Cell culture

2019

Recent single-cell RNA-sequencing studies have suggested that cells follow continuous transcriptomic trajectories in an asynchronous fashion during development. However, observations of cell flux along trajectories are confounded with population size effects in snapshot experiments and are therefore hard to interpret. In particular, changes in proliferation and death rates can be mistaken for cell flux. Here we present pseudodynamics, a mathematical framework that reconciles population dynamics with the concepts underlying developmental trajectories inferred from time-series single-cell data. Pseudodynamics models population distribution shifts across trajectories to quantify selection pressure, population expansion, and developmental potentials. Applying this model to time-resolved single-cell RNA-sequencing of T-cell and pancreatic beta cell maturation, we characterize proliferation and apoptosis rates and identify key developmental checkpoints, data inaccessible to existing approaches. A new computational method allows key developmental checkpoints and important parameters of population dynamics to be inferred from single-cell RNA-sequencing time series data.

Journal Article

Integration of single-cell transcriptomes and chromatin landscapes reveals regulatory programs driving pharyngeal organ development

by Aliee, Hananeh , Maehr, René , Kernfeld, Eric M. in 14/32 , 38/39 , 45/91

2022

Maldevelopment of the pharyngeal endoderm, an embryonic tissue critical for patterning of the pharyngeal region and ensuing organogenesis, ultimately contributes to several classes of human developmental syndromes and disorders. Such syndromes are characterized by a spectrum of phenotypes that currently cannot be fully explained by known mutations or genetic variants due to gaps in characterization of critical drivers of normal and dysfunctional development. Despite the disease-relevance of pharyngeal endoderm, we still lack a comprehensive and integrative view of the molecular basis and gene regulatory networks driving pharyngeal endoderm development. To close this gap, we apply transcriptomic and chromatin accessibility single-cell sequencing technologies to generate a multi-omic developmental resource spanning pharyngeal endoderm patterning to the emergence of organ-specific epithelia in the developing mouse embryo. We identify cell-type specific gene regulation, distill GRN models that define developing organ domains, and characterize the role of an immunodeficiency-associated forkhead box transcription factor. The molecular basis and gene regulatory networks driving pharyngeal endoderm development remain poorly understood. Here the authors report single cell transcriptomic and chromatin landscapes to delineate regulatory programs driving this process and to define the immunodeficiency-associated developmental defects resulting from Foxn1 dysfunction.

Journal Article

A comparison of computational methods for expression forecasting

by Yang, Yunxiao , Battle, Alexis , Weinstock, Joshua S. in Animal Genetics and Genomics , Benchmarks v2.0 , Bioinformatics

2025

Diverse machine learning methods promise to forecast gene expression changes in response to novel genetic perturbations. However, these methods’ accuracy is not well characterized. We created a benchmarking platform that combines a panel of 11 large-scale perturbation datasets with an expression forecasting software engine that encompasses or interfaces to a wide variety of methods. We used our platform to assess methods, parameters, and sources of auxiliary data, finding that it is uncommon for expression forecasting methods to outperform simple baselines. Our platform will serve as a resource to improve methods and to identify contexts in which expression forecasting can succeed.

Journal Article

A systematic comparison of computational methods for expression forecasting

by Yang, Yunxiao , Battle, Alexis , Cahan, Patrick in Bioinformatics

2024

Expression forecasting methods use machine learning models to predict how a cell will alter its transcriptome upon perturbation. Such methods are enticing because they promise to answer pressing questions in fields ranging from developmental genetics to cell fate engineering and because they are a fast, cheap, and accessible complement to the corresponding experiments. However, the absolute and relative accuracy of these methods is poorly characterized, limiting their informed use, their improvement, and the interpretation of their predictions. To address these issues, we created a benchmarking platform that combines a panel of 11 large-scale perturbation datasets with an expression forecasting software engine that encompasses or interfaces to a wide variety of methods. We used our platform to systematically assess methods, parameters, and sources of auxiliary data, finding that performance strongly depends on the choice of metric, and especially for simple metrics like mean squared error, it is uncommon for expression forecasting methods to out-perform simple baselines. Our platform will serve as a resource to improve methods and to identify contexts in which expression forecasting can succeed.

Journal Article

Model-X knockoffs reveal data-dependent limits on regulatory network identification

by Battle, Alexis , Cahan, Patrick , Keener, Rebecca in Bioinformatics

2023

Computational biologists have long sought to automatically infer transcriptional regulatory networks (TRNs) from gene expression data, but such approaches notoriously suffer from false positives. Two points of failure could yield false positives: faulty hypothesis testing, or erroneous assumption of a classic criterion called causal sufficiency. We show that a recent statistical development, model-X knockoffs, can effectively control false positives in tests of conditional independence in mouse and E. coli data, which rules out faulty hypothesis tests. Yet, benchmarking against ChIP and other gold standards reveals highly inflated false discovery rates. This identifies the causal sufficiency assumption as a key limiting factor in TRN inference.

Paper

Group-Invariant Subspace Clustering

by Shuchin Aeron , Kernfeld, Eric in Algorithms , Clustering , Data points

2015

In this paper we consider the problem of group-invariant subspace clustering, where the data is assumed to come from a union of group-invariant subspaces of a vector space, i.e. subspaces which are invariant with respect to the action of a given group. Algebraically, such group-invariant subspaces are also referred to as submodules. Similar to the well-known Sparse Subspace Clustering approach, where the data is assumed to come from a union of subspaces, we analyze an algorithm which, following a recent work [1], we refer to as Sparse Sub-module Clustering (SSmC). The method is based on finding a group-sparse self-representation of data points. In this paper we primarily derive general conditions under which such a group-invariant subspace identification is possible. In particular, we extend the geometric analysis in [2] and in the process we identify a related problem in geometric functional analysis.

Paper

Clustering multi-way data: a novel algebraic approach

by Shuchin Aeron , Kilmer, Misha , Kernfeld, Eric in Algorithms , Clustering , Data points

2015

In this paper, we develop a method for unsupervised clustering of two-way (matrix) data by combining two recent innovations from different fields: the Sparse Subspace Clustering (SSC) algorithm [10], which groups points coming from a union of subspaces into their respective subspaces, and the t-product [18], which was introduced to provide a matrix-like multiplication for third order tensors. Our algorithm is analogous to SSC in that an "affinity" between different data points is built using a sparse self-representation of the data. Unlike SSC, we employ the t-product in the self-representation. This allows us more flexibility in modeling; in fact, SSC is a special case of our method. When using the t-product, three-way arrays are treated as matrices whose elements (scalars) are n-tuples or tubes. Convolutions take the place of scalar multiplication. This framework allows us to embed the 2-D data into a vector-space-like structure called a free module over a commutative ring. These free modules retain many properties of complex inner-product spaces, and we leverage that to provide theoretical guarantees on our algorithm. We show that, compared to vector-space counterparts, SSmC achieves higher accuracy and is better able to cluster data with less preprocessing in some image clustering problems. In particular, we show the performance of the proposed method on the Weizmann face database, the Extended Yale B Face database and the MNIST handwritten digits database.

Paper

Multilinear Subspace Clustering

by Shuchin Aeron , Kilmer, Misha , Majumder, Nathan in Algorithms , Clustering , Image segmentation

2015

In this paper we present a new model and an algorithm for unsupervised clustering of 2-D data such as images. We assume that the data comes from a union of multilinear subspaces (UOMS) model, which is a specific structured case of the much-studied union of subspaces (UOS) model. For segmentation under this model, we develop the Multilinear Subspace Clustering (MSC) algorithm and evaluate its performance on the YaleB and Olivetti image data sets. We show that MSC is highly competitive with existing algorithms employing the UOS model in terms of clustering performance while enjoying improvement in computational complexity.

Paper

Beyond pseudotime: Following T-cell maturation in single-cell RNAseq time series

by Hasenauer, Jan , Maehr, Rene , Fiedler, Anna K in Bioinformatics , Cell differentiation , Gene expression

2017

Cellular development has traditionally been described as a series of transitions between discrete cell states, such as the sequence of double negative, double positive and single positive stages in T-cell development. Recent advances in single cell transcriptomics suggest an alternative description of development, in which cells follow continuous transcriptomic trajectories. A cell's state along such a trajectory can be captured with pseudotemporal ordering, which, however, cannot predict the development of the system in real time. We present pseudodynamics, a mathematical framework that integrates time-series and genetic knock-out information with such transcriptome-based descriptions in order to describe and analyze the real-time evolution of the system. Pseudodynamics models the distribution of a cell population across a continuous cell state coordinate over time based on a stochastic differential equation along developmental trajectories and random switching between trajectories in branching regions. To illustrate feasibility, we use pseudodynamics to estimate cell-state-dependent growth and differentiation of thymic T-cell development. The model approximates a developmental potential function (Waddington's landscape) and suggests that thymic T-cell development is biphasic and not strictly deterministic before beta-selection. Pseudodynamics generalizes classical discrete population models to continuous states and thus opens possibilities such as probabilistic model selection to single cell genomics.

Paper
