Catalogue Search | MBRL
102 result(s) for "Mayo, Deborah G."
On the Birnbaum Argument for the Strong Likelihood Principle
2014
An essential component of inference based on familiar frequentist notions, such as p-values, significance and confidence levels, is the relevant sampling distribution. This feature results in violations of a principle known as the strong likelihood principle (SLP), the focus of this paper. In particular, if outcomes x* and y* from experiments E₁ and E₂ (both with unknown parameter θ) have different probability models f₁(·), f₂(·), then even though f₁(x*; θ) = cf₂(y*; θ) for all θ, outcomes x* and y* may have different implications for an inference about θ. Although such violations stem from considering outcomes other than the one observed, we argue this does not require us to consider experiments other than the one performed to produce the data. David Cox [Ann. Math. Statist. 29 (1958) 357-372] proposes the Weak Conditionality Principle (WCP) to justify restricting the space of relevant repetitions. The WCP says that once it is known which Eᵢ produced the measurement, the assessment should be in terms of the properties of Eᵢ. The surprising upshot of Allan Birnbaum's [J. Amer. Statist. Assoc. 57 (1962) 269-306] argument is that the SLP appears to follow from applying the WCP in the case of mixtures together with so uncontroversial a principle as sufficiency (SP). But this would preclude the use of sampling distributions. The goal of this article is to provide a new clarification and critique of Birnbaum's argument. Although his argument purports that [(WCP and SP) entails SLP], we show how data may violate the SLP while holding both the WCP and SP. Such cases also refute [WCP entails SLP].
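A concrete instance of the SLP violation described here is the classic binomial versus negative binomial pair (a standard textbook illustration, not necessarily the paper's own example): both experiments can yield a likelihood proportional to θ⁹(1−θ)³, yet they assign the observed result different p-values because their sampling distributions differ. A minimal sketch, assuming SciPy is available:

```python
# Two experiments with proportional likelihoods for theta, both yielding
# 9 successes and 3 failures, tested against H0: theta = 0.5 vs H1: theta > 0.5.
from scipy import stats

theta0 = 0.5

# E1: binomial -- fix n = 12 trials, observe x* = 9 successes.
# p-value = P(X >= 9; theta0) over the binomial sampling distribution.
p_binom = stats.binom(n=12, p=theta0).sf(8)

# E2: negative binomial -- sample until r = 3 failures, observe y* = 9 successes.
# At theta0 = 0.5 success/failure are symmetric, so the count of successes
# before the 3rd failure is nbinom(3, 0.5) in SciPy's parametrization.
# p-value = P(Y >= 9; theta0) over the negative binomial sampling distribution.
p_nbinom = stats.nbinom(n=3, p=theta0).sf(8)

# Same likelihood (both proportional to theta^9 (1-theta)^3 for all theta),
# different p-values: an SLP violation driven by the sampling distribution.
print(f"binomial p-value:          {p_binom:.4f}")   # ~0.0730
print(f"negative binomial p-value: {p_nbinom:.4f}")  # ~0.0327
```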
Journal Article
Statistical significance and its critics: practicing damaging science, or damaging scientific practice?
2022
While the common procedure of statistical significance testing and its accompanying concept of p-values have long been surrounded by controversy, renewed concern has been triggered by the replication crisis in science. Many blame statistical significance tests themselves, and some regard them as sufficiently damaging to scientific practice as to warrant being abandoned. We take a contrary position, arguing that the central criticisms arise from misunderstanding and misusing the statistical tools, and that in fact the purported remedies themselves risk damaging science. We argue that banning the use of p-value thresholds in interpreting data does not diminish but rather exacerbates data-dredging and biasing selection effects. If an account cannot specify outcomes that will not be allowed to count as evidence for a claim—if all thresholds are abandoned—then there is no test of that claim. The contributions of this paper are: To explain the rival statistical philosophies underlying the ongoing controversy; To elucidate and reinterpret statistical significance tests, and explain how this reinterpretation ameliorates common misuses and misinterpretations; To argue why recent recommendations to replace, abandon, or retire statistical significance undermine a central function of statistics in science: to test whether observed patterns in the data are genuine or due to background variability.
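The biasing selection effects the abstract refers to can be quantified with a small simulation (an illustrative sketch, not code from the paper): searching through many null comparisons and reporting only the smallest p-value makes spurious findings routine unless the selection is accounted for.

```python
# Simulate data-dredging: run k independent t-tests on pure noise and report
# the minimum p-value. The probability that the *selected* p-value falls
# below 0.05 far exceeds 0.05, illustrating a biasing selection effect.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k, n, sims = 20, 30, 2000  # 20 null hypotheses, n=30 per group, 2000 replications

hits = 0
for _ in range(sims):
    pvals = [
        stats.ttest_ind(rng.normal(size=n), rng.normal(size=n)).pvalue
        for _ in range(k)
    ]
    hits += min(pvals) < 0.05

# Nominal rate is 5%; the dredged rate is roughly 1 - 0.95**20, about 64%.
print(f"selected-minimum 'significance' rate: {hits / sims:.2f}")
```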
Journal Article
Methodology in Practice: Statistical Misspecification Testing
2004
The growing availability of computer power and statistical software has greatly increased the ease with which practitioners apply statistical methods, but this has not been accompanied by attention to checking the assumptions on which these methods are based. At the same time, disagreements about inferences based on statistical research frequently revolve around whether the assumptions are actually met in the studies available, e.g., in psychology, ecology, biology, risk assessment. Philosophical scrutiny can help disentangle ‘practical’ problems of model validation, and conversely, a methodology of statistical model validation can shed light on a number of issues of interest to philosophers of science.
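As a concrete instance of the assumption-checking the abstract calls for (a generic sketch, not the paper's procedure), one can probe a fitted regression's error assumptions before trusting its inferences:

```python
# Misspecification-testing sketch: fit a straight line to data generated with a
# quadratic trend, then probe the residuals instead of trusting the fit's output.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 100)
y = 1.0 + 0.5 * x + 0.3 * x**2 + rng.normal(scale=1.0, size=x.size)  # truth: quadratic

slope, intercept, *_ = stats.linregress(x, y)
residuals = y - (intercept + slope * x)

# Distributional check on the residuals (Shapiro-Wilk normality test).
print(f"Shapiro-Wilk p-value: {stats.shapiro(residuals).pvalue:.3g}")

# Dependence check: the omitted quadratic term leaves a smooth U-shaped pattern,
# so residuals ordered by x are strongly autocorrelated -- a red flag that the
# linear model's assumptions (and hence its p-values) are not to be trusted.
r1, p1 = stats.pearsonr(residuals[:-1], residuals[1:])
print(f"lag-1 residual autocorrelation: r = {r1:.2f} (p = {p1:.3g})")
```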
Journal Article
Severe Testing as a Basic Concept in a Neyman–Pearson Philosophy of Induction
2006
Despite the widespread use of key concepts of the Neyman–Pearson (N–P) statistical paradigm—type I and II errors, significance levels, power, confidence levels—they have been the subject of philosophical controversy and debate for over 60 years. Both current and long-standing problems of N–P tests stem from unclarity and confusion, even among N–P adherents, as to how a test's (pre-data) error probabilities are to be used for (post-data) inductive inference as opposed to inductive behavior. We argue that the relevance of error probabilities is to ensure that only statistical hypotheses that have passed severe or probative tests are inferred from the data. The severity criterion supplies a meta-statistical principle for evaluating proposed statistical inferences, avoiding classic fallacies from tests that are overly sensitive, as well as those not sensitive enough to particular errors and discrepancies.
Contents:
1 Introduction and overview
1.1 Behavioristic and inferential rationales for Neyman–Pearson (N–P) tests
1.2 Severity rationale: induction as severe testing
1.3 Severity as a meta-statistical concept: three required restrictions on the N–P paradigm
2 Error statistical tests from the severity perspective
2.1 N–P test T(α): type I, II error probabilities and power
2.2 Specifying test T(α) using p-values
3 Neyman's post-data use of power
3.1 Neyman: does failure to reject H warrant confirming H?
4 Severe testing as a basic concept for an adequate post-data inference
4.1 The severity interpretation of acceptance (SIA) for test T(α)
4.2 The fallacy of acceptance (i.e., an insignificant difference): Ms Rosy
4.3 Severity and power
5 Fallacy of rejection: statistical vs. substantive significance
5.1 Taking a rejection of H0 as evidence for a substantive claim or theory
5.2 A statistically significant difference from H0 may fail to indicate a substantively important magnitude
5.3 Principle for the severity interpretation of a rejection (SIR)
5.4 Comparing significant results with different sample sizes in T(α): large n problem
5.5 General testing rules for T(α), using the severe testing concept
6 The severe testing concept and confidence intervals
6.1 Dualities between one and two-sided intervals and tests
6.2 Avoiding shortcomings of confidence intervals
7 Beyond the N–P paradigm: pure significance, and misspecification tests
8 Concluding comments: have we shown severity to be a basic concept in a N–P philosophy of induction?
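The severity criterion summarized above has a simple closed form for the one-sided normal test T(α). The sketch below follows the paper's SIA/SIR definitions; the numerical inputs are assumed for illustration, not taken from the article.

```python
# Severity for the one-sided normal test T(alpha):
#   H0: mu <= mu0 vs H1: mu > mu0, sigma known, xbar the observed mean of n draws.
# SIA (after a non-significant result): SEV(mu <= mu1) = P(Xbar > xbar; mu = mu1).
# SIR (after a significant result):     SEV(mu > mu1)  = P(Xbar <= xbar; mu = mu1).
from scipy.stats import norm

def sev_accept(xbar, mu1, sigma, n):
    """Severity of the claim mu <= mu1 following a non-significant result (SIA)."""
    return norm.sf((xbar - mu1) / (sigma / n**0.5))

def sev_reject(xbar, mu1, sigma, n):
    """Severity of the claim mu > mu1 following a significant result (SIR)."""
    return norm.cdf((xbar - mu1) / (sigma / n**0.5))

# Assumed numbers: mu0 = 0, sigma = 2, n = 100, observed xbar = 0.4,
# i.e. z = 2.0, statistically significant at alpha = 0.025.
for mu1 in (0.1, 0.2, 0.3, 0.5):
    print(f"SEV(mu > {mu1}) = {sev_reject(0.4, mu1, 2.0, 100):.3f}")
# Output ~ 0.933, 0.841, 0.691, 0.309: the rejection severely warrants mu > 0.1
# but not mu > 0.5, blocking the fallacy of reading substantive importance
# off statistical significance.
```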
Journal Article
Error statistical modeling and inference: Where methodology meets ontology
2015
In empirical modeling, an important desideratum for deeming theoretical entities and processes real is that they be reproducible in a statistical sense. Current-day crises regarding replicability in science intertwine with the question of how statistical methods link data to statistical and substantive theories and models. Different answers to this question have important methodological consequences for inference, which are intertwined with a contrast between the ontological commitments of the two types of models. The key to untangling them is the realization that behind every substantive model there is a statistical model that pertains exclusively to the probabilistic assumptions imposed on the data. It is not that the methodology determines whether to be a realist about entities and processes in a substantive field. It is rather that the substantive and statistical models refer to different entities and processes, and therefore call for different criteria of adequacy.
Journal Article
Error and the growth of experimental knowledge
1996
We may learn from our mistakes, but Deborah Mayo argues that, where experimental knowledge is concerned, we haven't begun to learn enough. Error and the Growth of Experimental Knowledge launches a vigorous critique of the subjective Bayesian view of statistical inference, and proposes Mayo's own error-statistical approach as a more robust framework for the epistemology of experiment. Mayo genuinely addresses the needs of researchers who work with statistical analysis, and simultaneously engages the basic philosophical problems of objectivity and rationality. Mayo has long argued for an account of learning from error that goes far beyond detecting logical inconsistencies. In this book, she presents her complete program for how we learn about the world by being "shrewd inquisitors of error, white gloves off." Her tough, practical approach will be important to philosophers, historians, and sociologists of science, and will be welcomed by researchers in the physical, biological, and social sciences whose work depends upon statistical analysis.
Philosophical Scrutiny of Evidence of Risks: From Bioethics to Bioevidence
2006
We argue that a responsible analysis of today's evidence-based risk assessments and risk debates in biology demands a critical or metascientific scrutiny of the uncertainties, assumptions, and threats of error along the manifold steps in risk analysis. Without an accompanying methodological critique, neither sensitivity to social and ethical values, nor conceptual clarification alone, suffices. In this view, restricting the invitation for philosophical involvement to those wearing a "bioethicist" label precludes the vitally important role philosophers of science may be able to play as bioevidentialists. The goal of this paper is to give a brief and partial sketch of how a metascientific scrutiny of risk evidence might work.
Journal Article