Catalogue Search | MBRL
11,323 result(s) for "Null hypothesis"
Abandon Statistical Significance
by McShane, Blakeley B.; Gelman, Andrew; Gal, David
in Biomedicine, Computer Science, Confidence intervals
2019
We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences as well as how these problems remain unresolved by proposals involving modified p-value thresholds, confidence intervals, and Bayes factors. We then discuss our own proposal, which is to abandon statistical significance. We recommend dropping the NHST paradigm, and the p-value thresholds intrinsic to it, as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with currently subordinate factors (e.g., related prior evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain) as just one among many pieces of evidence. We have no desire to "ban" p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors. We also argue that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures. We offer recommendations for how our proposal can be implemented in the scientific publication process as well as in statistical decision making more broadly.
Journal Article
General Forms of Finite Population Central Limit Theorems with Applications to Causal Inference
2017
Frequentists' inference often delivers point estimators associated with confidence intervals or sets for parameters of interest. Constructing the confidence intervals or sets requires understanding the sampling distributions of the point estimators, which, in many but not all cases, are related to asymptotic Normal distributions ensured by central limit theorems. Although previous literature has established various forms of central limit theorems for statistical inference in super population models, we still need general and convenient forms of central limit theorems for some randomization-based causal analyses of experimental data, where the parameters of interest are functions of a finite population and randomness comes solely from the treatment assignment. We use central limit theorems for sample surveys and rank statistics to establish general forms of the finite population central limit theorems that are particularly useful for proving asymptotic distributions of randomization tests under the sharp null hypothesis of zero individual causal effects, and for obtaining the asymptotic repeated sampling distributions of the causal effect estimators. The new central limit theorems hold for general experimental designs with multiple treatment levels, multiple treatment factors, and vector outcomes, and are immediately applicable for studying the asymptotic properties of many methods in causal inference, including instrumental variables, regression adjustment, rerandomization, cluster-randomized experiments, and so on. Previously, the asymptotic properties of these methods were often based on heuristic arguments, which in fact rely on general forms of finite population central limit theorems that had not been established before. Our new theorems fill this gap by providing a more solid theoretical foundation for asymptotic randomization-based causal inference. Supplementary materials for this article are available online.
Journal Article
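The sharp null of zero individual causal effects described in the abstract above fixes every potential outcome, so once the finite population is given, the randomization distribution of the difference in means comes solely from the treatment assignment; the article's finite population CLT says that distribution is asymptotically normal. A minimal simulation sketch of that setting (the population, sizes, and normality check are my own illustrative choices, not taken from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed finite population of N units. Under the sharp null of zero
# individual causal effects, Y(1) = Y(0) = y for every unit, so all
# randomness below comes solely from the treatment assignment.
N, n_treat = 200, 100
y = rng.exponential(size=N)          # skewed but fixed outcomes

draws = np.empty(10_000)
for i in range(draws.size):
    z = np.zeros(N, dtype=bool)
    z[rng.choice(N, n_treat, replace=False)] = True  # complete randomization
    draws[i] = y[z].mean() - y[~z].mean()            # difference in means

# The finite population CLT implies this randomization distribution is
# approximately normal; a crude check via the standardized skewness,
# which should be near zero despite the skewed outcomes.
skew = np.mean(((draws - draws.mean()) / draws.std()) ** 3)
```

The same simulated distribution is exactly what a randomization test of the sharp null compares the observed statistic against.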
A Paradox from Randomization-Based Causal Inference
2017
Under the potential outcomes framework, causal effects are defined as comparisons between potential outcomes under treatment and control. To infer causal effects from randomized experiments, Neyman proposed to test the null hypothesis of zero average causal effect (Neyman's null), and Fisher proposed to test the null hypothesis of zero individual causal effect (Fisher's null). Although the subtle difference between Neyman's null and Fisher's null has caused much controversy and confusion among both theoretical and applied statisticians, a careful comparison between the two approaches has been lacking in the literature for more than eighty years. We fill this historical gap by making a theoretical comparison between them and highlighting an intriguing paradox that has not been recognized by previous researchers. Logically, Fisher's null implies Neyman's null. It is therefore surprising that, in actual completely randomized experiments, rejection of Neyman's null does not imply rejection of Fisher's null for many realistic situations, including the case with constant causal effect. Furthermore, we show that this paradox also exists in other commonly used experiments, such as stratified experiments, matched-pair experiments, and factorial experiments. Asymptotic analyses, numerical examples, and real data examples all support this surprising phenomenon. Besides its historical and theoretical importance, this paradox also leads to useful practical implications for modern researchers.
Journal Article
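The two nulls contrasted in the abstract above can be tested side by side on the same completely randomized experiment: Neyman's null of zero average effect via a difference in means with a conservative variance estimate and a normal approximation, and Fisher's sharp null of zero individual effects via a randomization test. A hedged sketch, assuming the standard textbook forms of both tests (the toy data are mine, not the article's):

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(1)

def neyman_p(y, z):
    """Neyman's test of zero AVERAGE causal effect: difference in means
    with the conservative variance estimate s1^2/n1 + s0^2/n0 and a
    normal approximation (two-sided p-value)."""
    y1, y0 = y[z == 1], y[z == 0]
    tau_hat = y1.mean() - y0.mean()
    se = sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    return erfc(abs(tau_hat / se) / sqrt(2))

def fisher_p(y, z, n_draws=2000, rng=rng):
    """Fisher randomization test of zero INDIVIDUAL causal effects,
    using the same difference-in-means statistic: under this sharp null
    all potential outcomes equal the observed y, so the statistic's
    randomization distribution can be simulated directly."""
    obs = y[z == 1].mean() - y[z == 0].mean()
    draws = np.empty(n_draws)
    for i in range(n_draws):
        zp = rng.permutation(z)
        draws[i] = y[zp == 1].mean() - y[zp == 0].mean()
    return np.mean(np.abs(draws) >= abs(obs))

# Same data, two nulls. The paradox the article establishes is that in
# realistic settings the (logically weaker) Neyman null can be rejected
# while the Fisher null is not.
z = np.array([1] * 40 + [0] * 40)
y = rng.normal(size=80)
p_neyman, p_fisher = neyman_p(y, z), fisher_p(y, z)
```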
A tutorial on a practical Bayesian alternative to null-hypothesis significance testing
Null-hypothesis significance testing remains the standard inferential tool in cognitive science despite its serious disadvantages. Primary among these is the fact that the resulting probability value does not tell the researcher what he or she usually wants to know: How probable is a hypothesis, given the obtained data? Inspired by developments presented by Wagenmakers (Psychonomic Bulletin & Review, 14, 779–804, 2007), I provide a tutorial on a Bayesian model selection approach that requires only a simple transformation of sum-of-squares values generated by the standard analysis of variance. This approach generates a graded level of evidence regarding which model (e.g., effect absent [null hypothesis] vs. effect present [alternative hypothesis]) is more strongly supported by the data. This method also obviates admonitions never to speak of accepting the null hypothesis. An Excel worksheet for computing the Bayesian analysis is provided as supplemental material.
Journal Article
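The transformation the tutorial above builds on (after Wagenmakers, 2007) converts ordinary ANOVA sums of squares into an approximate Bayes factor via the BIC: BIC = n·ln(SSE/n) + k·ln(n) for each model, and BF01 ≈ exp((BIC1 − BIC0)/2). A minimal sketch of that computation, assuming the null model differs from the alternative only by the effect's degrees of freedom (variable names are my own):

```python
import math

def bic_bayes_factor(ss_effect, ss_error, n, df_effect):
    """BIC-based approximate Bayes factor comparing a null (effect-absent)
    model against an alternative (effect-present) model, using only the
    sums of squares from a standard ANOVA (after Wagenmakers, 2007).
    Returns (BF_01, posterior probability of the null under equal priors)."""
    sse_null = ss_effect + ss_error   # null model absorbs the effect SS
    sse_alt = ss_error
    # BIC = n * ln(SSE / n) + k * ln(n); shared parameters cancel in the
    # difference, leaving only the df_effect extra parameters.
    delta_bic = n * math.log(sse_alt / sse_null) + df_effect * math.log(n)
    bf01 = math.exp(delta_bic / 2)    # BF_01 ≈ exp((BIC_1 - BIC_0) / 2)
    posterior_null = bf01 / (1 + bf01)
    return bf01, posterior_null

# Tiny effect SS relative to error: graded evidence FOR the null.
bf01, p_null = bic_bayes_factor(ss_effect=1.0, ss_error=100.0, n=30, df_effect=1)
```

This is the sense in which the method yields graded evidence and permits "accepting" the null: BF01 above 1 favors the effect-absent model, below 1 the effect-present model.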
Moving Towards the Post p < 0.05 Era via the Analysis of Credibility
It is now widely accepted that the techniques of null hypothesis significance testing (NHST) are routinely misused and misinterpreted by researchers seeking insight from data. There is, however, no consensus on acceptable alternatives, leaving researchers with little choice but to continue using NHST, regardless of its failings. I examine the potential for the Analysis of Credibility (AnCred) to resolve this impasse. Using real-life examples, I assess the ability of AnCred to provide researchers with a simple but robust framework for assessing study findings that goes beyond the standard dichotomy of statistical significance/nonsignificance. By extracting more insight from standard summary statistics while offering more protection against inferential fallacies, AnCred may encourage researchers to move toward the post p < 0.05 era.
Journal Article
Causal null hypotheses of sustained treatment strategies: What can be tested with an instrumental variable?
by Hernán, Miguel A.; Swanson, Sonja A.; Labrecque, Jeremy
in Cardiology, Causality, Epidemiology
2018
Sometimes instrumental variable methods are used to test whether a causal effect is null rather than to estimate the magnitude of a causal effect. However, when instrumental variable methods are applied to time-varying exposures, as in many Mendelian randomization studies, it is unclear what causal null hypothesis is tested. Here, we consider different versions of causal null hypotheses for time-varying exposures, show that the instrumental variable conditions alone are insufficient to test some of them, and describe additional assumptions that can be made to test a wider range of causal null hypotheses, including both sharp and average causal null hypotheses. Implications for interpretation and reporting of instrumental variable results are discussed.
Journal Article
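One version of the idea in the abstract above, under the classical IV conditions (randomized instrument, exclusion restriction), is that a sharp causal null implies no instrument–outcome association, so the reduced form can be permutation-tested without ever estimating the effect's magnitude. A minimal sketch; the covariance statistic and toy data are illustrative choices of mine, and, as the article stresses, which causal null this actually tests for a time-varying exposure depends on additional assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def iv_null_test(z, y, n_draws=5000, rng=rng):
    """Permutation test of the instrument-outcome association cov(Z, Y).
    Under the IV conditions, a sharp causal null of the exposure on the
    outcome implies this association is null, so rejecting it rejects
    the causal null without estimating the effect."""
    obs = abs(np.cov(z, y)[0, 1])
    draws = np.array([abs(np.cov(rng.permutation(z), y)[0, 1])
                      for _ in range(n_draws)])
    return np.mean(draws >= obs)

# Toy Mendelian-randomization-style setup: a randomized binary instrument
# (e.g., genotype) that shifts the exposure, and an outcome on which the
# exposure has zero causal effect.
n = 200
z = rng.integers(0, 2, size=n)       # instrument
x = z + rng.normal(size=n)           # exposure moved by the instrument
y_null = rng.normal(size=n)          # causal null: x does not affect y
p = iv_null_test(z, y_null)
```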
Coconut: covariate-assisted composite null hypothesis testing with applications to replicability analysis of high-throughput experimental data
2025
Background
Multiple testing of composite null hypotheses is critical for identifying simultaneous signals across studies. While it is common to incorporate external information in simple null hypotheses, exploiting such auxiliary covariates to provide prior structural relationships among composite null hypotheses and boost the statistical power remains challenging.
Results
We propose a robust and powerful covariate-assisted composite null hypothesis testing (CoCoNuT) procedure based on a Bayesian framework to identify replicable signals in two studies while asymptotically controlling the false discovery rate. CoCoNuT innovatively adopts a three-dimensional mixture model to consider two primary studies and an integrative auxiliary covariate jointly. While accounting for heterogeneity across studies, the local false discovery rate optimally captures cross-study and cross-feature information, providing improved rankings of feature importance.
Conclusions
Theoretical and empirical evaluations confirm the validity and efficiency of CoCoNuT. Extensive simulations demonstrate that CoCoNuT outperforms conventional methods that do not exploit auxiliary covariates while controlling the FDR. We apply CoCoNuT to schizophrenia genome-wide association studies, illustrating its higher power in identifying replicable genetic variants with the assistance of relevant auxiliary studies.
Journal Article
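CoCoNuT itself is described above only at a high level; as a point of reference, a conventional composite-null baseline of the kind the abstract compares against is max-p combination of the two studies' p-values followed by Benjamini–Hochberg. A sketch of that baseline only (not of CoCoNuT, and without its auxiliary covariate):

```python
import numpy as np

def maxp_bh(p1, p2, alpha=0.05):
    """Baseline test of the composite null 'signal absent in at least one
    study': take the maximum of the two per-study p-values (valid, if
    conservative, for the composite null) and apply Benjamini-Hochberg
    at level alpha. Returns a boolean mask of features declared
    replicable across both studies."""
    p = np.maximum(np.asarray(p1, dtype=float), np.asarray(p2, dtype=float))
    m = len(p)
    order = np.argsort(p)
    # BH: largest k with p_(k) <= alpha * k / m; reject the k smallest.
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    k = int(np.max(np.nonzero(below)[0])) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected
```

Procedures like CoCoNuT gain power over this by modeling the joint structure of the two studies and the covariate rather than collapsing each feature to a single max-p value.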
New-day statistical thinking
by van Witteloostuijn, Arjen
in Business and Management, Business Strategy/Leadership, Conventions
2020
In this commentary, I argue why we should stop engaging in null hypothesis statistical significance testing altogether. Artificial and misleading it may be, but we know how to play the p value threshold and null hypothesis-testing game. We feel secure; we love the certainty. The fly in the ointment is that the conventions have led to questionable research practices. Wasserstein, Schirm, & Lazar (Am Stat 73(sup1):1–19, 2019. https://doi.org/10.1080/00031305.2019.1583913) explain why, in their thought-provoking editorial introducing a special issue of The American Statistician: “As ‘statistical significance’ is used less, statistical thinking will be used more.” Perhaps we empirical researchers can together find a way to work ourselves out of the straitjacket that binds us.
Journal Article
New Guidelines for Null Hypothesis Significance Testing in Hypothetico-Deductive IS Research
2020
The objective of this research perspectives article is to promote policy change among journals, scholars, and students with a vested interest in hypothetico-deductive information systems (IS) research. We are concerned about the design, analysis, reporting, and reviewing of quantitative IS studies that draw on null hypothesis significance testing (NHST). We observe that although debates about misinterpretations, abuse, and issues with NHST have persisted for about half a century, they remain largely absent in IS. We find this to be an untenable position for a discipline with a proud quantitative tradition. We discuss traditional and emergent threats associated with the application of NHST and examine how they manifest in recent IS scholarship. To encourage the development of new standards for NHST in hypothetico-deductive IS research, we develop a balanced account of possible actions that are implementable in the short-term or long-term and that incentivize or penalize specific practices. To promote an immediate push for change, we also develop two sets of guidelines that IS scholars can adopt immediately.
Journal Article
Worked-out examples of the adequacy of Bayesian optional stopping
by Kiers, Henk A. L.; Tendeiro, Jorge N.; van Ravenzwaaij, Don
in Bayes Theorem, Bayesian analysis, Behavioral Science and Psychology
2022
The practice of sequentially testing a null hypothesis as data are collected until the null hypothesis is rejected is known as optional stopping. It is well known that optional stopping is problematic in the context of p value-based null hypothesis significance testing: The false-positive rates quickly overcome the single test's significance level. However, the state of affairs under null hypothesis Bayesian testing, where p values are replaced by Bayes factors, has perhaps surprisingly been much less consensual. Rouder (2014) used simulations to defend the use of optional stopping under null hypothesis Bayesian testing. The idea behind these simulations is closely related to the idea of sampling from prior predictive distributions. Deng et al. (2016) and Hendriksen et al. (2020) have provided mathematical evidence to the effect that optional stopping under null hypothesis Bayesian testing does hold under some conditions. These papers are, however, exceedingly technical for most researchers in the applied social sciences. In this paper, we provide some mathematical derivations concerning Rouder's approximate simulation results for the two Bayesian hypothesis tests that he considered. The key idea is to consider the probability distribution of the Bayes factor, which is regarded as a random variable across repeated sampling. This paper therefore offers an intuitive perspective on the literature, and we believe it is a valid contribution towards understanding the practice of optional stopping in the context of Bayesian hypothesis testing.
Journal Article
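The p-value half of the contrast in the abstract above is easy to demonstrate: testing after every new observation and stopping at the first "significant" look inflates the false-positive rate well beyond the nominal level. A small simulation sketch under a true null (the design parameters are my own):

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(3)

def sequential_z_test_fpr(n_reps=2000, max_n=100, start_n=10, alpha=0.05):
    """Simulate p-value-based optional stopping under a true null:
    standard normal data, a two-sided z-test of mean = 0 (known sigma = 1)
    after every new observation from start_n onward, stopping at the first
    look with p < alpha. Returns the false-positive rate across
    replications, which exceeds alpha because of the repeated looks."""
    false_pos = 0
    for _ in range(n_reps):
        x = rng.normal(size=max_n)
        csum = np.cumsum(x)
        for n in range(start_n, max_n + 1):
            z = csum[n - 1] / sqrt(n)        # z = sample mean * sqrt(n)
            if erfc(abs(z) / sqrt(2)) < alpha:   # two-sided p-value
                false_pos += 1
                break
    return false_pos / n_reps

fpr = sequential_z_test_fpr()
```

Replacing the per-look p-value with a Bayes factor and a stopping rule on its value is the setting the paper analyzes via the sampling distribution of the Bayes factor.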