Catalogue Search | MBRL

The data detective : ten easy rules to make sense of statistics

by Harford, Tim, 1973- author in Statistics Methodology , Social sciences Statistical methods

"Today we think statistics are the enemy, numbers used to mislead and confuse us. That's a mistake, Tim Harford says in The Data Detective. We shouldn't be suspicious of statistics; we need to understand what they mean and how they can improve our lives: they are, at heart, human behavior seen through the prism of numbers and are often 'the only way of grasping much of what is going on around us.' If we can toss aside our fears and learn to approach them clearly, understanding how our own preconceptions lead us astray, statistics can point to ways we can live better and work smarter. As 'perhaps the best popular economics writer in the world' (New Statesman), Tim Harford is an expert at taking complicated ideas and untangling them for millions of readers. In The Data Detective, he uses new research in science and psychology to set out ten strategies for using statistics to erase our biases and replace them with new ideas that use virtues like patience, curiosity, and good sense to better understand ourselves and the world. As a result, The Data Detective is a big-idea book about statistics and human behavior that is fresh, unexpected, and insightful"-- Provided by publisher.

From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline version 2; peer review: 5 approved

by Smyth, Gordon K , Lun, Aaron T. L , Chen, Yunshun in Genomics , Software Tool , Statistical Methodologies & Health Informatics

2016

In recent years, RNA sequencing (RNA-seq) has become a very widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. This article demonstrates a computational workflow for the detection of DE genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment profiling epithelial cell subsets in the mouse mammary gland. The workflow uses R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visualization and pathway analysis. Read alignment and count quantification are conducted using the Rsubread package, and the statistical analyses are performed using the edgeR package. The differential expression analysis uses the quasi-likelihood functionality of edgeR.

Journal Article

The reviewer's guide to quantitative methods in the social sciences

by Hancock, Gregory R., editor , Stapleton, Laura M., editor , Mueller, Ralph O., editor in Social sciences Research Methodology. , Social sciences Statistical methods.

Book

Bioconductor workflow for microbiome data analysis: from raw reads to community analyses version 1; peer review: 3 approved

by Fukuyama, Julia A , Sankaran, Kris , McMurdie, Paul J in Bioinformatics , Microbial Evolution & Genomics , Microbiomes

2016

High-throughput sequencing of PCR-amplified taxonomic markers (like the 16S rRNA gene) has enabled a new level of analysis of complex bacterial communities known as microbiomes. Many tools exist to quantify and compare abundance levels or microbial composition of communities in different conditions. The sequencing reads have to be denoised and assigned to the closest taxa from a reference database. Common approaches use a notion of 97% similarity and normalize the data by subsampling to equalize library sizes. In this paper, we show that statistical models allow more accurate abundance estimates. By providing a complete workflow in R, we enable the user to do sophisticated downstream statistical analyses, including both parametric and nonparametric methods. We provide examples of using the R packages dada2, phyloseq, DESeq2, ggplot2 and vegan to filter, visualize and test microbiome data. We also provide examples of supervised analyses using random forests, partial least squares and linear models as well as nonparametric testing using community networks and the ggnetwork package.

Journal Article

Exercising essential statistics

by Berman, Evan M. author in Social sciences Statistical methods , Social sciences Statistics Methodology

2007

Book

Data Missing Not at Random in Mobile Health Research: Assessment of the Problem and a Case for Sensitivity Analyses

by Goldberg, Simon B , Davidson, Richard J , Bolt, Daniel M in Bias , Biometry , Humans

2021

Missing data are common in mobile health (mHealth) research. There has been little systematic investigation of how missingness is handled statistically in mHealth randomized controlled trials (RCTs). Although some missing data patterns (ie, missing at random [MAR]) may be adequately addressed using modern missing data methods such as multiple imputation and maximum likelihood techniques, these methods do not address bias when data are missing not at random (MNAR). It is typically not possible to determine whether the missing data are MAR. However, higher attrition in active (ie, intervention) versus passive (ie, waitlist or no treatment) conditions in mHealth RCTs raises a strong likelihood of MNAR, such as if active participants who benefit less from the intervention are more likely to drop out. This study aims to systematically evaluate differential attrition and methods used for handling missingness in a sample of mHealth RCTs comparing active and passive control conditions. We also aim to illustrate a modern model-based sensitivity analysis and a simpler fixed-value replacement approach that can be used to evaluate the influence of MNAR. We reanalyzed attrition rates and predictors of differential attrition in a sample of 36 mHealth RCTs drawn from a recent meta-analysis of smartphone-based mental health interventions. We systematically evaluated the design features related to missingness and its handling. Data from a recent mHealth RCT were used to illustrate 2 sensitivity analysis approaches (pattern-mixture model and fixed-value replacement approach). Attrition in active conditions was, on average, roughly twice that of passive controls. Differential attrition was higher in larger studies and was associated with the use of MAR-based multiple imputation or maximum likelihood methods. Half of the studies (18/36, 50%) used these modern missing data techniques. None of the 36 mHealth RCTs reviewed conducted a sensitivity analysis to evaluate the possible consequences of data MNAR. Pattern-mixture model and fixed-value replacement sensitivity analysis approaches were introduced. Results from a recent mHealth RCT were shown to be robust to missing data, reflecting worse outcomes in missing versus nonmissing scores in some but not all scenarios. A review of such scenarios helps to qualify the observations of significant treatment effects. MNAR data because of differential attrition are likely in mHealth RCTs using passive controls. Sensitivity analyses are recommended to allow researchers to assess the potential impact of MNAR on trial results.

Journal Article

Multilevel modeling for social and personality psychology

by Nezlek, John B. (John Bruce), 1952- in Social psychology Research Methodology. , Personality Research Methodology. , Social psychology Statistical methods.

Book

Predicting relapse or recurrence of depression: systematic review of prognostic models

by Meader, Nicholas , Gilbody, Simon , Phillips, Robert S. in Adult , Adults , Best practice

2022

Relapse and recurrence of depression are common, contributing to the overall burden of depression globally. Accurate prediction of relapse or recurrence while patients are well would allow the identification of high-risk individuals and may effectively guide the allocation of interventions to prevent relapse and recurrence. We aimed to review prognostic models developed to predict the risk of relapse, recurrence, sustained remission, or recovery in adults with remitted major depressive disorder. We searched the Cochrane Library (current issue); Ovid MEDLINE (1946 onwards); Ovid Embase (1980 onwards); Ovid PsycINFO (1806 onwards); and Web of Science (1900 onwards) up to May 2021. We included development and external validation studies of multivariable prognostic models. We assessed risk of bias of included studies using the Prediction model risk of bias assessment tool (PROBAST). We identified 12 eligible prognostic model studies (11 unique prognostic models): 8 model development-only studies, 3 model development and external validation studies and 1 external validation-only study. Multiple estimates of performance measures were not available and meta-analysis was therefore not possible. Eleven out of the 12 included studies were assessed as being at high overall risk of bias and none examined clinical utility. Due to high risk of bias of the included studies, poor predictive performance and limited external validation of the models identified, presently available clinical prediction models for relapse and recurrence of depression are not yet sufficiently developed for deploying in clinical settings. There is a need for improved prognosis research in this clinical area and future studies should conform to best practice methodological and reporting guidelines.

Journal Article

The SAGE dictionary of statistics & methodology : a nontechnical guide for the social sciences

by Vogt, W. Paul , Johnson, Burke in Social sciences Statistical methods Dictionaries. , Social sciences Methodology Dictionaries.

Book

Stepped wedge cluster randomised trials: a review of the statistical methodology used and available

by Campbell, M. J. , McElduff, P. , Barker, D. in Clinical trials , Cluster Analysis , Cluster randomised

2016

Background: Previous reviews have focussed on the rationale for employing the stepped wedge design (SWD), the areas of research to which the design has been applied and the general characteristics of the design. However, these did not focus on the statistical methods or address the appropriateness of the sample size methods used. This was a review of the literature on the statistical methodology used in stepped wedge cluster randomised trials. Methods: Literature review. The Medline, Embase, PsycINFO, CINAHL and Cochrane databases were searched for methodological guides and RCTs which employed the stepped wedge design. Results: This review identified 102 trials which employed the stepped wedge design, compared to 37 from the most recent review by Beard et al. 2015. Forty-six trials were cohort designs and 45% (n = 46) had fewer than 10 clusters. Of the 42 articles discussing the design methodology, 10 covered analysis and seven covered sample size. For cohort stepped wedge designs there was only one paper considering analysis and one considering sample size methods. Most trials employed either a GEE or mixed model approach to analysis (n = 77) but only 22 trials (22%) estimated sample size in a way which accounted for the stepped wedge design that was subsequently used. Conclusions: Many studies which employ the stepped wedge design have few clusters but use methods of analysis which may require more clusters for unbiased and efficient intervention effect estimates. There is a need for research on the minimum number of clusters required for both types of stepped wedge design. Researchers should distinguish in the sample size calculation between cohort and cross-sectional stepped wedge designs. Further research is needed on the effect of adjusting for the potential confounding of time on the study power.

Journal Article
