Catalogue Search | MBRL

Simple but powerful interactive data analysis in R with R/LinkedCharts

by Ovchinnikova, Svetlana , Anders, Simon in Bioinformatics , Biological Assay , Data Analysis

2024

In research involving data-rich assays, exploratory data analysis is a crucial step. Typically, this involves jumping back and forth between visualizations that provide overview of the whole data and others that dive into details. For example, it might be helpful to have one chart showing a summary statistic for all samples, while a second chart provides details for points selected in the first chart. We present R/LinkedCharts, a framework that renders this task radically simple, requiring very few lines of code to obtain complex and general visualization, which later can be polished to provide interactive data access of publication quality.

Journal Article

Simple but powerful interactive data analysis in R with R/LinkedCharts

by Ovchinnikova, Svetlana , Anders, Simon in Animal Genetics and Genomics , Bioinformatics , Biomedical and Life Sciences

2024

In research involving data-rich assays, exploratory data analysis is a crucial step. Typically, this involves jumping back and forth between visualizations that provide overview of the whole data and others that dive into details. For example, it might be helpful to have one chart showing a summary statistic for all samples, while a second chart provides details for points selected in the first chart. We present R/LinkedCharts, a framework that renders this task radically simple, requiring very few lines of code to obtain complex and general visualization, which later can be polished to provide interactive data access of publication quality.

Journal Article

EpiSmokEr: a robust classifier to determine smoking status from DNA methylation data

by Anders, Simon , Korhonen, Tellervo , Kaprio, Jaakko in Adult , Aged , Artificial intelligence

2019

Smoking strongly influences DNA methylation, with current and never smokers exhibiting different methylation profiles. To advance the practical applicability of the smoking-associated methylation signals, we used machine learning methodology to train a classifier for smoking status prediction. We show the prediction performance of our classifier on three independent whole-blood datasets demonstrating its robustness and global applicability. Furthermore, we examine the reasons for biologically meaningful misclassifications through comprehensive phenotypic evaluation. The major contribution of our classifier is its global applicability without a need for users to determine a threshold value for each dataset to predict the smoking status. We provide an R package, EpiSmokEr (Epigenetic Smoking status Estimator), facilitating the use of our classifier to predict smoking status in future studies.

Journal Article

Dissecting intratumour heterogeneity of nodal B-cell lymphomas at the transcriptional, genetic and drug-response levels

by Anders, Simon , Hundemer, Michael , Uvarovskii, Alexey in B-cell lymphoma , Cancer , Cancer therapies

2020

Tumour heterogeneity encompasses both the malignant cells and their microenvironment. While heterogeneity between individual patients is known to affect the efficacy of cancer therapy, most personalized treatment approaches do not account for intratumour heterogeneity. We addressed this issue by studying the heterogeneity of nodal B-cell lymphomas by single-cell RNA-sequencing and transcriptome-informed flow cytometry. We identified transcriptionally distinct malignant subpopulations and compared their drug-response and genomic profiles. Malignant subpopulations from the same patient responded strikingly differently to anti-cancer drugs ex vivo, which recapitulated subpopulation-specific drug sensitivity during in vivo treatment. Infiltrating T cells represented the majority of non-malignant cells, whose gene-expression signatures were similar across all donors, whereas the frequencies of T-cell subsets varied significantly between the donors. Our data provide insights into the heterogeneity of nodal B-cell lymphomas and highlight the relevance of intratumour heterogeneity for personalized cancer therapy. Roider et al. combine scRNA-seq and transcriptome-informed flow cytometry, and uncover transcriptionally different malignant subclones with distinct drug responses and T-cell profiles in B-cell non-Hodgkin lymphoma.

Journal Article

Focused multidimensional scaling: interactive visualization for exploration of high-dimensional data

by Urpa, Lea M. , Anders, Simon in Algorithms , Animals , Bioinformatics

2019

Background: Visualization is an important tool for generating meaning from scientific data, but the visualization of structures in high-dimensional data (such as from high-throughput assays) presents unique challenges. Dimension reduction methods are key in solving this challenge, but these methods can be misleading, especially when apparent clustering in the dimension-reducing representation is used as the basis for reasoning about relationships within the data. Results: We present two interactive visualization tools, distnet and focusedMDS, that help in assessing the validity of a dimension-reducing plot and in interactively exploring relationships between objects in the data. The distnet tool is used to examine discrepancies between the placement of points in a two-dimensional visualization and the points’ actual similarities in feature space. The focusedMDS tool is an intuitive, interactive multidimensional scaling tool that is useful for exploring the relationships of one particular data point to the others, which might be useful in a personalized medicine framework. Conclusions: We introduce here two freely available tools for visually exploring and verifying the validity of dimension-reducing visualizations and biological information gained from these. The use of such tools can confirm that conclusions drawn from dimension-reducing visualizations are not simply artifacts of the visualization method, but are real biological insights.

Journal Article

A computational method for detection of ligand-binding proteins from dose range thermal proteome profiles

by Bantscheff, Marcus , Savitski, Mikhail M. , Kurzawa, Nils in 631/114/2415 , 631/114/2784 , 631/1647/48

2020

Detecting ligand-protein interactions in living cells is a fundamental challenge in molecular biology and drug research. Proteome-wide profiling of thermal stability as a function of ligand concentration promises to tackle this challenge. However, current data analysis strategies use preset thresholds that can lead to suboptimal sensitivity/specificity tradeoffs and limited comparability across datasets. Here, we present a method based on statistical hypothesis testing on curves, which provides control of the false discovery rate. We apply it to several datasets probing epigenetic drugs and a metabolite. This leads us to detect off-target drug engagement, including the finding that the HDAC8 inhibitor PCI-34051 and its analog BRD-3811 bind to and inhibit leucine aminopeptidase 3. An implementation is available as an R package from Bioconductor (https://bioconductor.org/packages/TPP2D). We hope that our method will facilitate prioritizing targets from thermal profiling experiments. 2D-thermal proteome profiling (2D-TPP) is a powerful assay for probing interactions of proteins with small molecules in their native context. Here the authors provide a statistical method for false discovery rate controlled analysis for 2D-TPP applications.

Journal Article

RNA-Seq workflow: gene-level exploratory analysis and differential expression

by Anders, Simon , Kim, Vladislav , Huber, Wolfgang in Bioinformatics , Genomics

2016

Here we walk through an end-to-end gene-level RNA-Seq differential expression workflow using Bioconductor packages. We will start from the FASTQ files, show how these were aligned to the reference genome, and prepare a count matrix which tallies the number of RNA-seq reads/fragments within each gene for each sample. We will perform exploratory data analysis (EDA) for quality assessment and to explore the relationship between samples, perform differential gene expression analysis, and visually explore the results.

Journal Article

Count-based differential expression analysis of RNA sequencing data using R and Bioconductor

by Smyth, Gordon K , Huber, Wolfgang , Chen, Yunshun in 38/91 , 631/114/2415 , 631/1647/514/1949

2013

RNA sequencing (RNA-seq) has been rapidly adopted for the profiling of transcriptomes in many areas of biology, including studies into gene regulation, development and disease. Of particular interest is the discovery of differentially expressed genes across different conditions (e.g., tissues, perturbations) while optionally adjusting for other systematic factors that affect the data-collection process. There are a number of subtle yet crucial aspects of these analyses, such as read counting, appropriate treatment of biological variability, quality control checks and appropriate setup of statistical modeling. Several variations have been presented in the literature, and there is a need for guidance on current best practices. This protocol presents a state-of-the-art computational and statistical RNA-seq differential expression analysis workflow largely based on the free open-source R language and Bioconductor software and, in particular, on two widely used tools, DESeq and edgeR. Hands-on time for typical small experiments (e.g., 4–10 samples) can be <1 h, with computation time <1 d using a standard desktop PC.

Journal Article

Transcriptome and translatome co-evolution in mammals

by Leushkin, Evgeny , Brüning, Thoomke , Anders, Simon in 38/39 , 38/90 , 38/91

2020

Gene-expression programs define shared and species-specific phenotypes, but their evolution remains largely uncharacterized beyond the transcriptome layer [1]. Here we report an analysis of the co-evolution of translatomes and transcriptomes using ribosome-profiling and matched RNA-sequencing data for three organs (brain, liver and testis) in five mammals (human, macaque, mouse, opossum and platypus) and a bird (chicken). Our within-species analyses reveal that translational regulation is widespread in the different organs, in particular across the spermatogenic cell types of the testis. The between-species divergence in gene expression is around 20% lower at the translatome layer than at the transcriptome layer owing to extensive buffering between the expression layers, which especially preserved old, essential and housekeeping genes. Translational upregulation specifically counterbalanced global dosage reductions during the evolution of sex chromosomes and the effects of meiotic sex-chromosome inactivation during spermatogenesis. Despite the overall prevalence of buffering, some genes evolved faster at the translatome layer—potentially indicating adaptive changes in expression; testis tissue shows the highest fraction of such genes. Further analyses incorporating mass spectrometry proteomics data establish that the co-evolution of transcriptomes and translatomes is reflected at the proteome layer. Together, our work uncovers co-evolutionary patterns and associated selective forces across the expression layers, and provides a resource for understanding their interplay in mammalian organs. An analysis using ribosome-profiling and matched RNA-sequencing data for three organs across five mammalian species and a bird enables the comparison of translatomes and transcriptomes, revealing patterns of co-evolution of these two expression layers.

Journal Article

Accounting for technical noise in single-cell RNA-seq experiments

by Proserpio, Valentina , Marioni, John C , Teichmann, Sarah A in 631/114/2415 , 631/208/199 , 631/449/1659

2013

A statistical method that uses spike-ins to model the dependence of technical noise on transcript abundance in single-cell RNA-seq experiments allows identification of genes wherein observed variability in read counts can be reliably interpreted as a signal of biological variability as opposed to the effect of technical noise. Single-cell RNA-seq can yield valuable insights about the variability within a population of seemingly homogeneous cells. We developed a quantitative statistical method to distinguish true biological variability from the high levels of technical noise in single-cell experiments. Our approach quantifies the statistical significance of observed cell-to-cell variability in expression strength on a gene-by-gene basis. We validate our approach using two independent data sets from Arabidopsis thaliana and Mus musculus.

Journal Article
