Catalogue Search | MBRL

Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations

by Schneider, Jesper W in Bayesian analysis, Bibliometrics, Confusion

2015

Null hypothesis statistical significance tests (NHST) are widely used in quantitative research in the empirical sciences including scientometrics. Nevertheless, since their introduction nearly a century ago significance tests have been controversial. Many researchers are not aware of the numerous criticisms raised against NHST. As practiced, NHST has been characterized as a ‘null ritual’ that is overused and too often misapplied and misinterpreted. NHST is in fact a patchwork of two fundamentally different classical statistical testing models, often blended with some wishful quasi-Bayesian interpretations. This is undoubtedly a major reason why NHST is very often misunderstood. But NHST also has intrinsic logical problems and the epistemic range of the information provided by such tests is much more limited than most researchers recognize. In this article we introduce to the scientometric community the theoretical origins of NHST, which is mostly absent from standard statistical textbooks, and we discuss some of the most prevalent problems relating to the practice of NHST and trace these problems back to the mix-up of the two different theoretical origins. Finally, we illustrate some of the misunderstandings with examples from the scientometric literature and bring forward some modest recommendations for a more sound practice in quantitative data analysis.

Journal Article

On the Reproducibility of Psychological Science

by Wang, Tianying; Mandal, Soutrik; Asher, Alex in Applications and Case Studies, Bayes factor, Bias

2017

Investigators from a large consortium of scientists recently performed a multi-year study in which they replicated 100 psychology experiments. Although statistically significant results were reported in 97% of the original studies, statistical significance was achieved in only 36% of the replicated studies. This article presents a reanalysis of these data based on a formal statistical model that accounts for publication bias by treating outcomes from unpublished studies as missing data, while simultaneously estimating the distribution of effect sizes for those studies that tested nonnull effects. The resulting model suggests that more than 90% of tests performed in eligible psychology experiments tested negligible effects, and that publication biases based on p-values caused the observed rates of nonreproducibility. The results of this reanalysis provide a compelling argument for both increasing the threshold required for declaring scientific discoveries and for adopting statistical summaries of evidence that account for the high proportion of tested hypotheses that are false. Supplementary materials for this article are available online.
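The mechanism described in this abstract, where selecting on p-values among mostly negligible effects lowers replication rates, can be sketched with a small simulation. This is a toy illustration only, not the formal missing-data model used in the article: the share of null effects, the size of the non-null effect, the study count, and the function names are all hypothetical choices, and each study is reduced to a single z-statistic with unit variance.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_studies=20000, share_null=0.9, effect=2.5, alpha=0.05, seed=1):
    """Return (number of published studies, share of them that replicate)."""
    rng = random.Random(seed)
    published = 0
    replicated = 0
    for _ in range(n_studies):
        # Hypothetical mix: most studies test a true effect of exactly zero.
        mu = 0.0 if rng.random() < share_null else effect
        original_z = rng.gauss(mu, 1.0)
        if two_sided_p(original_z) >= alpha:
            continue  # nonsignificant originals stay unpublished
        published += 1
        # Replication of the same true effect with the same precision.
        replication_z = rng.gauss(mu, 1.0)
        if two_sided_p(replication_z) < alpha:
            replicated += 1
    return published, replicated / published


published, replication_rate = simulate()
print(published, round(replication_rate, 2))
```

Every published original is significant by construction, yet the replication rate falls well below 100% because many published results come from null effects that crossed the threshold by chance. The exact rate depends entirely on the assumed mix of effects.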

Journal Article

A new Bayesian discrepancy measure

by Bertolino, Francesco; Musio, Monica; Ventura, Laura in Bayesian analysis, Chemistry and Earth Sciences, Computer Science

2024

The aim of this article is to make a contribution to the Bayesian procedure of testing precise hypotheses for parametric models. For this purpose, we define the Bayesian Discrepancy Measure that allows one to evaluate the suitability of a given hypothesis with respect to the available information (prior law and data). To summarise this information, the posterior median is employed, allowing a simple assessment of the discrepancy with a fixed hypothesis. The Bayesian Discrepancy Measure assesses the compatibility of a single hypothesis with the observed data, as opposed to the more common comparative approach where a hypothesis is rejected in favour of a competing hypothesis. The proposed measure of evidence has properties of consistency and invariance. After presenting the definition of the measure for a parameter of interest, both in the absence and in the presence of nuisance parameters, we illustrate some examples showing its conceptual and interpretative simplicity. Finally, we compare a test procedure based on the Bayesian Discrepancy Measure, with the Full Bayesian Significance Test, a well-known Bayesian testing procedure for sharp hypotheses.

Journal Article

Bayesian Versus Orthodox Statistics: Which Side Are You On?

by Dienes, Zoltan in Bayesian analysis, Bayesian Statistics, Decision making

2011

Researchers are often confused about what can be inferred from significance tests. One problem occurs when people apply Bayesian intuitions to significance testing—two approaches that must be firmly separated. This article presents some common situations in which the approaches come to different conclusions; you can see where your intuitions initially lie. The situations include multiple testing, deciding when to stop running participants, and when a theory was thought of relative to finding out results. The interpretation of nonsignificant results has also been persistently problematic in a way that Bayesian inference can clarify. The Bayesian and orthodox approaches are placed in the context of different notions of rationality, and I accuse myself and others as having been irrational in the way we have been using statistics on a key notion of rationality. The reader is shown how to apply Bayesian inference in practice, using free online software, to allow more coherent inferences from data.

Journal Article

Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations

by Senn, Stephen J.; Altman, Douglas G.; Goodman, Steven N.

2016

Journal Article

Inferential Statistics as Descriptive Statistics: There Is No Replication Crisis if We Don't Expect Replication

by Greenland, Sander; Trafimow, David; Amrhein, Valentin in Adopting More Holistic Approaches, Assumptions, Auxiliary hypotheses

2019

Statistical inference often fails to replicate. One reason is that many results may be selected for drawing inference because some threshold of a statistic like the P-value was crossed, leading to biased reported effect sizes. Nonetheless, considerable non-replication is to be expected even without selective reporting, and generalizations from single studies are rarely if ever warranted. Honestly reported results must vary from replication to replication because of varying assumption violations and random variation; excessive agreement itself would suggest deeper problems, such as failure to publish results in conflict with group expectations or desires. A general perception of a "replication crisis" may thus reflect failure to recognize that statistical tests not only test hypotheses, but countless assumptions and the entire environment in which research takes place. Because of all the uncertain and unknown assumptions that underpin statistical inferences, we should treat inferential statistics as highly unstable local descriptions of relations between assumptions and data, rather than as providing generalizable inferences about hypotheses or models. And that means we should treat statistical results as being much more incomplete and uncertain than is currently the norm. Acknowledging this uncertainty could help reduce the allure of selective reporting: Since a small P-value could be large in a replication study, and a large P-value could be small, there is simply no need to selectively report studies based on statistical results. Rather than focusing our study reports on uncertain conclusions, we should thus focus on describing accurately how the study was conducted, what problems occurred, what data were obtained, what analysis methods were used and why, and what output those methods produced.

Journal Article

Bayesian data analysis for newcomers

by Kruschke, John K.; Liddell, Torrin M. in Bayes Theorem, Bayesian analysis, Behavioral Science and Psychology

2018

This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.

Journal Article

The Model Confidence Set

by Nason, James M.; Hansen, Peter R.; Lunde, Asger in Applications, Bootstrap-Verfahren, Confidence

2011

This paper introduces the model confidence set (MCS) and applies it to the selection of models. A MCS is a set of models that is constructed such that it will contain the best model with a given level of confidence. The MCS is in this sense analogous to a confidence interval for a parameter. The MCS acknowledges the limitations of the data, such that uninformative data yield a MCS with many models, whereas informative data yield a MCS with only a few models. The MCS procedure does not assume that a particular model is the true model; in fact, the MCS procedure can be used to compare more general objects, beyond the comparison of models. We apply the MCS procedure to two empirical problems. First, we revisit the inflation forecasting problem posed by Stock and Watson (1999), and compute the MCS for their set of inflation forecasts. Second, we compare a number of Taylor rule regressions and determine the MCS of the best regression in terms of in-sample likelihood criteria.

Journal Article

Machine Learning Based Intrusion Detection Systems for IoT Applications

by Verma, Abhishek; Ranga, Virender in Accuracy, Algorithms, Blacklisting

2020

The Internet of Things (IoT) and its applications are among the most popular research areas at present. The characteristics of IoT on one side make it easily applicable to real-life applications, whereas on the other side they expose it to cyber threats. Denial of Service (DoS) is one of the most catastrophic attacks against IoT. In this paper, we investigate the prospects of using machine learning classification algorithms for securing IoT against DoS attacks. A comprehensive study is carried out on the classifiers which can advance the development of anomaly-based intrusion detection systems (IDSs). Performance assessment of classifiers is done in terms of prominent metrics and validation methods. Popular datasets CIDDS-001, UNSW-NB15, and NSL-KDD are used for benchmarking classifiers. Friedman and Nemenyi tests are employed to analyze the significant differences among classifiers statistically. In addition, Raspberry Pi is used to evaluate the response time of classifiers on IoT specific hardware. We also discuss a methodology for selecting the best classifier as per application requirements. The main goals of this study are to motivate IoT security researchers to develop IDSs using ensemble learning, and to suggest appropriate methods for statistical assessment of classifier performance.

Journal Article

A SIGNIFICANCE TEST FOR THE LASSO

by Tibshirani, Robert; Taylor, Jonathan; Tibshirani, Ryan J. in 62F03, 62J05, 62J07

2014

In the sparse linear regression setting, we consider testing the significance of the predictor variable that enters the current lasso model, in the sequence of models visited along the lasso solution path. We propose a simple test statistic based on lasso fitted values, called the covariance test statistic, and show that when the true model is linear, this statistic has an Exp(1) asymptotic distribution under the null hypothesis (the null being that all truly active variables are contained in the current lasso model). Our proof of this result for the special case of the first predictor to enter the model (i.e., testing for a single significant predictor variable against the global null) requires only weak assumptions on the predictor matrix X. On the other hand, our proof for a general step in the lasso path places further technical assumptions on X and the generative model, but still allows for the important high-dimensional case p>n, and does not necessarily require that the current lasso model achieves perfect recovery of the truly active variables. Of course, for testing the significance of an additional variable between two nested linear models, one typically uses the chi-squared test, comparing the drop in residual sum of squares (RSS) to a $\chi_1^2$ distribution. But when this additional variable is not fixed, and has been chosen adaptively or greedily, this test is no longer appropriate: adaptivity makes the drop in RSS stochastically much larger than $\chi_1^2$ under the null hypothesis. Our analysis explicitly accounts for adaptivity, as it must, since the lasso builds an adaptive sequence of linear models as the tuning parameter λ decreases. In this analysis, shrinkage plays a key role: though additional variables are chosen adaptively, the coefficients of lasso active variables are shrunken due to the l₁ penalty. Therefore, the test statistic (which is based on lasso fitted values) is in a sense balanced by these two opposing properties, adaptivity and shrinkage, and its null distribution is tractable and asymptotically Exp(1).
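The abstract's point about adaptivity can be checked with a short simulation. The sketch below does not implement the covariance test. It only illustrates why the naive chi-squared comparison fails when the variable is chosen greedily. It rests on simplifying assumptions that are not from the article: the global null holds, the predictor columns are orthonormal, the noise variance is known and equal to 1, and the added variable is fitted by unshrunken least squares. Under those assumptions each inner product between a predictor and the response is standard normal, and adding that predictor alone lowers the RSS by its square. The function name and defaults are hypothetical.

```python
import random

CHI2_1_CRIT_95 = 3.841  # 95th percentile of the chi-squared distribution with 1 df


def rejection_rates(n_predictors=10, n_trials=20000, seed=0):
    """Null rejection rates of the chi-squared(1) test for a fixed vs. greedy choice."""
    rng = random.Random(seed)
    fixed_hits = 0
    adaptive_hits = 0
    for _ in range(n_trials):
        # RSS drop for each candidate predictor: the square of a N(0, 1) draw.
        drops = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n_predictors)]
        # Predictor chosen before seeing the data.
        fixed_hits += drops[0] > CHI2_1_CRIT_95
        # Predictor chosen greedily as the one with the largest RSS drop.
        adaptive_hits += max(drops) > CHI2_1_CRIT_95
    return fixed_hits / n_trials, adaptive_hits / n_trials


fixed_rate, adaptive_rate = rejection_rates()
print(round(fixed_rate, 3), round(adaptive_rate, 3))
```

With ten candidate predictors, the fixed choice rejects at close to the nominal 5%, while the greedy choice rejects at roughly 1 - 0.95^10, or about 40%, because the largest of ten squared normal draws is compared against a threshold meant for a single draw.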

Journal Article
