Catalogue Search | MBRL
846 result(s) for "C12"
Channeling Fisher
2019
I follow R. A. Fisher’s The Design of Experiments (1935), using randomization statistical inference to test the null hypothesis of no treatment effects in a comprehensive sample of 53 experimental papers drawn from the journals of the American Economic Association. In the average paper, randomization tests of the significance of individual treatment effects find 13% to 22% fewer significant results than are found using authors’ methods. In joint tests of multiple treatment effects appearing together in tables, randomization tests yield 33% to 49% fewer statistically significant results than conventional tests. Bootstrap and jackknife methods support and confirm the randomization results.
Journal Article
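The randomization inference this abstract describes can be sketched as a simple permutation test of the sharp null of no treatment effect: re-randomize the treatment labels many times and compare the observed difference in means to its permutation distribution. This is an illustrative reimplementation, not the paper's code; the simulated data and effect size below are invented for the example.

```python
import numpy as np

def randomization_test(y, treat, n_perm=2000, seed=0):
    """Fisher-style randomization test: p-value is the share of
    re-randomizations whose difference in means is at least as
    extreme as the observed one."""
    rng = np.random.default_rng(seed)
    treat = np.asarray(treat, dtype=bool)
    observed = y[treat].mean() - y[~treat].mean()
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(treat)  # re-randomize treatment labels
        stat = y[perm].mean() - y[~perm].mean()
        if abs(stat) >= abs(observed):
            count += 1
    return (count + 1) / (n_perm + 1)  # add-one correction keeps p > 0

# Invented simulated experiment: 100 treated, 100 control units.
rng = np.random.default_rng(1)
treat = rng.permutation(np.repeat([True, False], 100))
y_null = rng.normal(size=200)          # no treatment effect
y_eff = y_null + 1.0 * treat           # constant effect of 1.0
p_null = randomization_test(y_null, treat)
p_eff = randomization_test(y_eff, treat)
```

With a genuine effect the permutation p-value is tiny; under the null it is roughly uniform on (0, 1).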
False (and Missed) Discoveries in Financial Economics
2020
Multiple testing plagues many important questions in finance such as fund and factor selection. We propose a new way to calibrate both Type I and Type II errors. Next, using a double-bootstrap method, we establish a t-statistic hurdle that is associated with a specific false discovery rate (e.g., 5%). We also establish a hurdle that is associated with a certain acceptable ratio of misses to false discoveries (Type II error scaled by Type I error), which effectively allows for differential costs of the two types of mistakes. Evaluating current methods, we find that they lack power to detect outperforming managers.
Journal Article
Methods Matter
2020
The credibility revolution in economics has promoted causal identification using randomized control trials (RCT), difference-in-differences (DID), instrumental variables (IV) and regression discontinuity design (RDD). Applying multiple approaches to over 21,000 hypothesis tests published in 25 leading economics journals, we find that the extent of p-hacking and publication bias varies greatly by method. IV (and to a lesser extent DID) are particularly problematic. We find no evidence that (i) papers published in the Top 5 journals are different to others; (ii) the journal “revise and resubmit” process mitigates the problem; (iii) things are improving through time.
Journal Article
Confidence intervals for policy evaluation in adaptive experiments
by Wager, Stefan; Hirshberg, David A.; Athey, Susan
in Algorithms; Confidence intervals; Data collection
2021
Adaptive experimental designs can dramatically improve efficiency in randomized trials. But with adaptively collected data, common estimators based on sample means and inverse propensity-weighted means can be biased or heavy-tailed. This poses statistical challenges, in particular when the experimenter would like to test hypotheses about parameters that were not targeted by the data-collection mechanism. In this paper, we present a class of test statistics that can handle these challenges. Our approach is to adaptively reweight the terms of an augmented inverse propensity-weighting estimator to control the contribution of each term to the estimator’s variance. This scheme reduces overall variance and yields an asymptotically normal test statistic. We validate the accuracy of the resulting estimates and their CIs in numerical experiments and show that our methods compare favorably to existing alternatives in terms of mean squared error, coverage, and CI size.
Journal Article
Statistical Significance, p-Values, and the Reporting of Uncertainty
2021
The use of statistical significance and p-values has become a matter of substantial controversy in various fields using statistical methods. This has gone as far as some journals banning the use of indicators for statistical significance, or even any reports of p-values, and, in one case, any mention of confidence intervals. I discuss three of the issues that have led to these often-heated debates. First, I argue that in many cases, p-values and indicators of statistical significance do not answer the questions of primary interest. Such questions typically involve making (recommendations on) decisions under uncertainty. In that case, point estimates and measures of uncertainty in the form of confidence intervals or even better, Bayesian intervals, are often more informative summary statistics. In fact, in that case, the presence or absence of statistical significance is essentially irrelevant, and including them in the discussion may confuse the matter at hand. Second, I argue that there are also cases where testing null hypotheses is a natural goal and where p-values are reasonable and appropriate summary statistics. I conclude that banning them in general is counterproductive. Third, I discuss that the overemphasis in empirical work on statistical significance has led to abuse of p-values in the form of p-hacking and publication bias. The use of pre-analysis plans and replication studies, in combination with lowering the emphasis on statistical significance may help address these problems.
Journal Article
Cauchy Combination Test: A Powerful Test With Analytic p-Value Calculation Under Arbitrary Dependency Structures
2020
Combining individual p-values to aggregate multiple small effects has been of long-standing interest in statistics, dating back to the classic Fisher's combination test. In modern large-scale data analysis, correlation and sparsity are common features and efficient computation is a necessary requirement for dealing with massive data. To overcome these challenges, we propose a new test that takes advantage of the Cauchy distribution. Our test statistic has a simple form and is defined as a weighted sum of Cauchy transformations of individual p-values. We prove a nonasymptotic result that the tail of the null distribution of our proposed test statistic can be well approximated by a Cauchy distribution under arbitrary dependency structures. Based on this theoretical result, the p-value calculation of our proposed test is not only accurate, but also as simple as the classic z-test or t-test, making our test well suited for analyzing massive data. We further show that the power of the proposed test is asymptotically optimal in a strong sparsity setting. Extensive simulations demonstrate that the proposed test has both strong power against sparse alternatives and good accuracy with respect to p-value calculations, especially for very small p-values. The proposed test has also been applied to a genome-wide association study of Crohn's disease and compared with several existing tests.
Supplementary materials for this article are available online.
Journal Article
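Because the test statistic has a closed form, the combination rule the abstract describes is easy to sketch: transform each p-value as tan((0.5 − p)π), take a weighted sum, and read the combined p-value off the standard Cauchy tail. This is an illustrative sketch assuming equal weights when none are given, not the authors' reference implementation.

```python
import math

def cauchy_combination_pvalue(pvals, weights=None):
    """Cauchy combination test: weighted sum of Cauchy-transformed
    p-values, with the null tail approximated by a standard Cauchy
    distribution under arbitrary dependence."""
    d = len(pvals)
    if weights is None:
        weights = [1.0 / d] * d  # equal weights summing to one
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, pvals))
    # Survival function of the standard Cauchy distribution
    return 0.5 - math.atan(t) / math.pi

# One very small p-value dominates the statistic, which is what
# gives the test power against sparse alternatives.
combined = cauchy_combination_pvalue([1e-6, 0.4, 0.7, 0.9])
```

A single p-value of 10⁻⁶ drives the combined p-value to roughly the same order even though the other three inputs are unremarkable.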
The Effect of Social Density on Word of Mouth
2018
This research investigates whether a contextual factor—social density, defined as the number of people in a given area—influences consumers’ propensity to share information. We propose that high- (vs. low-) density settings make consumers experience a loss of perceived control, which in turn makes them more likely to engage in word of mouth to restore it. Six studies, conducted online as well as in laboratory and naturalistic settings, provide support for this hypothesis. We demonstrate that social density increases the likelihood of sharing information with others and that a person’s chronic need for control moderates this effect. Consistent with the proposed process, the effect of social density on information sharing is attenuated when participants have the opportunity to restore control before they engage in word of mouth. We also provide evidence that sharing information restores perceived control in high-density environments, and we disentangle the effect of social density from that of physical proximity.
Journal Article
When Is Parallel Trends Sensitive to Functional Form?
by Sant’Anna, Pedro H. C.; Roth, Jonathan
in Difference-in-differences; Falsification; Functional form
2023
This paper assesses when the validity of difference-in-differences depends on functional form. We provide a novel characterization: the parallel trends assumption holds under all strictly monotonic transformations of the outcome if and only if a stronger “parallel trends”-type condition holds for the cumulative distribution function of untreated potential outcomes. This condition for parallel trends to be insensitive to functional form is satisfied if and essentially only if the population can be partitioned into a subgroup for which treatment is effectively randomly assigned and a remaining subgroup for which the distribution of untreated potential outcomes is stable over time. These conditions have testable implications, and we introduce falsification tests for the null that parallel trends is insensitive to functional form.
Journal Article
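A toy numeric example makes the functional-form sensitivity concrete: if two groups' untreated means grow by the same absolute amount, parallel trends holds in levels but fails after a log transformation. The numbers below are invented purely for illustration and are not drawn from the paper.

```python
import math

# Both groups' untreated means grow by +5, so level trends are parallel,
# but the proportional (log) changes differ.
control = {"pre": 10.0, "post": 15.0}              # +5, i.e. +50%
treated_untreated = {"pre": 100.0, "post": 105.0}  # +5, i.e. +5%

level_gap = ((treated_untreated["post"] - treated_untreated["pre"])
             - (control["post"] - control["pre"]))
log_gap = ((math.log(treated_untreated["post"])
            - math.log(treated_untreated["pre"]))
           - (math.log(control["post"]) - math.log(control["pre"])))
# level_gap is zero: parallel trends holds in levels.
# log_gap is far from zero: parallel trends fails in logs.
```

A DiD estimated in levels would attribute nothing to treatment here, while the same design estimated in logs would report a large spurious effect, which is exactly the sensitivity the paper characterizes.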
Inference for subvectors and other functions of partially identified parameters in moment inequality models
by Shi, Xiaoxia; Bugni, Federico A.; Canay, Ivan A.
in Confidence intervals; Econometrics; Equality
2017
This paper introduces a bootstrap-based inference method for functions of the parameter vector in a moment (in)equality model. These functions are restricted to be linear for two-sided testing problems, but may be nonlinear for one-sided testing problems. In the most common case, this function selects a subvector of the parameter, such as a single component. The new inference method we propose controls asymptotic size uniformly over a large class of data distributions and improves upon the two existing methods that deliver uniform size control for this type of problem: projection-based and subsampling inference. Relative to projection-based procedures, our method presents three advantages: (i) it weakly dominates in terms of finite sample power, (ii) it strictly dominates in terms of asymptotic power, and (iii) it is typically less computationally demanding. Relative to subsampling, our method presents two advantages: (i) it strictly dominates in terms of asymptotic power (for reasonable choices of subsample size), and (ii) it appears to be less sensitive to the choice of its tuning parameter than subsampling is to the choice of subsample size.
Journal Article
Physics potentials with the second Hyper-Kamiokande detector in Korea
2018
Hyper-Kamiokande consists of two identical water-Cherenkov detectors of total 520 kt, with the first one in Japan at 295 km from the J-PARC neutrino beam with a 2.5° off-axis angle (OAA), and the second one possibly in Korea at a later stage. Having the second detector in Korea would benefit almost all areas of neutrino oscillation physics, mainly due to longer baselines. There are several candidate sites in Korea with baselines of 1000–1300 km and OAAs of 1°–3°. We conducted sensitivity studies on neutrino oscillation physics for a second detector, either in Japan (JD × 2) or Korea (JD + KD), and compared the results with a single detector in Japan. Leptonic charge–parity (CP) symmetry violation sensitivity is improved, especially when CP is non-maximally violated. The larger matter effect at Korean candidate sites significantly enhances sensitivities to non-standard interactions of neutrinos and mass-ordering determination. Current studies indicate the best sensitivity is obtained at Mt. Bisul (1088 km baseline, 1.3° OAA). Thanks to a larger (1000 m) overburden than the first detector site, clear improvements to sensitivities for solar and supernova relic neutrino searches are expected.
Journal Article