Search Results

846 results for "C12"
CHANNELING FISHER
I follow R. A. Fisher’s The Design of Experiments (1935), using randomization statistical inference to test the null hypothesis of no treatment effects in a comprehensive sample of 53 experimental papers drawn from the journals of the American Economic Association. In the average paper, randomization tests of the significance of individual treatment effects find 13% to 22% fewer significant results than are found using authors’ methods. In joint tests of multiple treatment effects appearing together in tables, randomization tests yield 33% to 49% fewer statistically significant results than conventional tests. Bootstrap and jackknife methods support and confirm the randomization results.
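As a rough illustration of the randomization-inference logic described above, the sketch below permutes treatment labels under the sharp null of no effect and recomputes a difference in means. The data, sample size, and number of permutations are invented for illustration and are not taken from the paper.

```python
# A minimal sketch of randomization inference for the sharp null of no
# treatment effect, in the spirit of the Fisher-style tests described above.
import numpy as np

rng = np.random.default_rng(0)

def randomization_pvalue(y, d, n_perm=5000):
    """Two-sided randomization p-value for the sharp null of no effect.

    y: outcomes; d: 0/1 treatment indicator. Under the sharp null the
    outcomes are fixed, so we re-randomize the treatment labels and
    recompute the difference in means each time.
    """
    observed = y[d == 1].mean() - y[d == 0].mean()
    perm_stats = np.empty(n_perm)
    for b in range(n_perm):
        d_perm = rng.permutation(d)  # re-randomized assignment
        perm_stats[b] = y[d_perm == 1].mean() - y[d_perm == 0].mean()
    return np.mean(np.abs(perm_stats) >= abs(observed))

# Illustrative experiment: 100 units, true effect of 0.2 standard deviations.
d = rng.integers(0, 2, size=100)
y = 0.2 * d + rng.normal(size=100)
print(randomization_pvalue(y, d))
```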
False (and Missed) Discoveries in Financial Economics
Multiple testing plagues many important questions in finance such as fund and factor selection. We propose a new way to calibrate both Type I and Type II errors. Next, using a double-bootstrap method, we establish a t-statistic hurdle that is associated with a specific false discovery rate (e.g., 5%). We also establish a hurdle that is associated with a certain acceptable ratio of misses to false discoveries (Type II error scaled by Type I error), which effectively allows for differential costs of the two types of mistakes. Evaluating current methods, we find that they lack power to detect outperforming managers.
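The sketch below illustrates the general idea of calibrating a t-statistic hurdle to a target false discovery rate. It uses a single bootstrap under an imposed zero-alpha null and invented fund returns; the paper's actual procedure is a double bootstrap and differs in its details.

```python
# A simplified, single-bootstrap sketch of calibrating a t-statistic hurdle to
# a target false discovery rate. Fund returns, the share of true
# outperformers, and all tuning choices below are invented.
import numpy as np

rng = np.random.default_rng(1)
T, N = 120, 500                                      # months, funds
alpha = np.where(rng.random(N) < 0.1, 0.5, 0.0)      # 10% of funds truly outperform
returns = alpha + rng.normal(0.0, 2.0, size=(T, N))

def t_stats(r):
    return r.mean(axis=0) / (r.std(axis=0, ddof=1) / np.sqrt(r.shape[0]))

t_obs = t_stats(returns)
null_returns = returns - returns.mean(axis=0)        # impose alpha = 0, fund by fund

def estimated_fdr(hurdle, n_boot=200):
    # Average number of null funds crossing the hurdle, relative to the
    # number of discoveries at that hurdle in the actual data.
    false = 0.0
    for _ in range(n_boot):
        idx = rng.integers(0, T, size=T)             # bootstrap the months
        false += np.sum(t_stats(null_returns[idx]) > hurdle)
    discoveries = max(np.sum(t_obs > hurdle), 1)
    return (false / n_boot) / discoveries

grid = np.arange(1.5, 4.0, 0.05)
hurdle = next((h for h in grid if estimated_fdr(h) <= 0.05), grid[-1])
print(f"t-statistic hurdle for a 5% false discovery rate: {hurdle:.2f}")
```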
Methods Matter
The credibility revolution in economics has promoted causal identification using randomized control trials (RCT), difference-in-differences (DID), instrumental variables (IV) and regression discontinuity design (RDD). Applying multiple approaches to over 21,000 hypothesis tests published in 25 leading economics journals, we find that the extent of p-hacking and publication bias varies greatly by method. IV (and to a lesser extent DID) are particularly problematic. We find no evidence that (i) papers published in the Top 5 journals are different to others; (ii) the journal “revise and resubmit” process mitigates the problem; (iii) things are improving through time.
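One simple diagnostic from the p-hacking literature, and only a small piece of the much richer battery of tests applied in work like this, is a caliper test: compare how many reported z-statistics fall just above the 1.96 significance threshold with how many fall just below it, counts that should be roughly balanced over a narrow window absent selective reporting. The z-statistics below are invented, with extra mass placed above 1.96.

```python
# A caliper test for bunching of z-statistics just above 1.96, sketched on
# invented data; this is a generic diagnostic, not the paper's own procedure.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
z = np.concatenate([
    np.abs(rng.normal(0.0, 1.5, size=5000)),    # "honest" z-statistics
    rng.uniform(1.96, 2.10, size=150),          # bunching just above the threshold
])

caliper = 0.20
above = int(np.sum((z >= 1.96) & (z < 1.96 + caliper)))
below = int(np.sum((z >= 1.96 - caliper) & (z < 1.96)))

# Under no bunching, a statistic landing in the narrow window is roughly
# equally likely to fall on either side of 1.96.
p_value = stats.binomtest(above, above + below, 0.5, alternative="greater").pvalue
print(f"above={above}, below={below}, caliper-test p-value={p_value:.4g}")
```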
Confidence intervals for policy evaluation in adaptive experiments
Adaptive experimental designs can dramatically improve efficiency in randomized trials. But with adaptively collected data, common estimators based on sample means and inverse propensity-weighted means can be biased or heavy-tailed. This poses statistical challenges, in particular when the experimenter would like to test hypotheses about parameters that were not targeted by the data-collection mechanism. In this paper, we present a class of test statistics that can handle these challenges. Our approach is to adaptively reweight the terms of an augmented inverse propensity-weighting estimator to control the contribution of each term to the estimator’s variance. This scheme reduces overall variance and yields an asymptotically normal test statistic. We validate the accuracy of the resulting estimates and their CIs in numerical experiments and show that our methods compare favorably to existing alternatives in terms of mean squared error, coverage, and CI size.
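A schematic of the augmented inverse propensity-weighted (AIPW) scores that such an estimator reweights is sketched below. The bandit data, logged propensities, and plug-in outcome model are invented, and the square-root-propensity weights are only a simple stand-in for the paper's adaptive variance-controlling weights.

```python
# Schematic AIPW scores for the value of one arm of an adaptively run
# experiment, with a simple (non-adaptive) choice of per-observation weights.
import numpy as np

rng = np.random.default_rng(3)
T = 2000
e = np.clip(rng.beta(2, 2, size=T), 0.05, 0.95)   # logged P(arm 1) at each step
a = rng.binomial(1, e)                            # realized arm
y = 0.5 * a + rng.normal(0.0, 1.0, size=T)        # outcomes; arm 1's true value is 0.5
m_hat = np.full(T, 0.25)                          # crude plug-in model of E[Y | arm 1]

# AIPW score for the value of arm 1: model prediction plus an IPW correction.
gamma = m_hat + (a == 1) / e * (y - m_hat)

# Per-observation weights; normalizing keeps the estimator a weighted average.
h = np.sqrt(e)
estimate = np.sum(h * gamma) / np.sum(h)
print(f"weighted AIPW estimate of arm 1's value: {estimate:.3f}")
```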
Statistical Significance, p-Values, and the Reporting of Uncertainty
The use of statistical significance and p-values has become a matter of substantial controversy in various fields using statistical methods. This has gone as far as some journals banning the use of indicators for statistical significance, or even any reports of p-values, and, in one case, any mention of confidence intervals. I discuss three of the issues that have led to these often-heated debates. First, I argue that in many cases, p-values and indicators of statistical significance do not answer the questions of primary interest. Such questions typically involve making (recommendations on) decisions under uncertainty. In that case, point estimates and measures of uncertainty in the form of confidence intervals or even better, Bayesian intervals, are often more informative summary statistics. In fact, in that case, the presence or absence of statistical significance is essentially irrelevant, and including them in the discussion may confuse the matter at hand. Second, I argue that there are also cases where testing null hypotheses is a natural goal and where p-values are reasonable and appropriate summary statistics. I conclude that banning them in general is counterproductive. Third, I discuss that the overemphasis in empirical work on statistical significance has led to abuse of p-values in the form of p-hacking and publication bias. The use of pre-analysis plans and replication studies, in combination with lowering the emphasis on statistical significance may help address these problems.
Cauchy Combination Test: A Powerful Test With Analytic p-Value Calculation Under Arbitrary Dependency Structures
Combining individual p-values to aggregate multiple small effects has long been of interest in statistics, dating back to Fisher's classic combination test. In modern large-scale data analysis, correlation and sparsity are common features, and efficient computation is a necessary requirement for dealing with massive data. To overcome these challenges, we propose a new test that takes advantage of the Cauchy distribution. Our test statistic has a simple form and is defined as a weighted sum of Cauchy transformations of individual p-values. We prove a nonasymptotic result that the tail of the null distribution of our proposed test statistic can be well approximated by a Cauchy distribution under arbitrary dependency structures. Based on this theoretical result, the p-value calculation of our proposed test is not only accurate but also as simple as the classic z-test or t-test, making our test well suited for analyzing massive data. We further show that the power of the proposed test is asymptotically optimal in a strong-sparsity setting. Extensive simulations demonstrate that the proposed test has both strong power against sparse alternatives and good accuracy in p-value calculation, especially for very small p-values. The proposed test has also been applied to a genome-wide association study of Crohn's disease and compared with several existing tests. Supplementary materials for this article are available online.
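The combination statistic itself is simple enough to sketch directly: each p-value is mapped through a Cauchy transformation, the transformed values are combined as a weighted sum, and the combined p-value is read off the tail of a standard Cauchy distribution. The uniform weights and example p-values below are illustrative choices.

```python
# A minimal implementation of the Cauchy combination statistic described above.
import numpy as np
from scipy.stats import cauchy

def cauchy_combination(pvalues, weights=None):
    p = np.asarray(pvalues, dtype=float)
    w = np.full(p.size, 1.0 / p.size) if weights is None else np.asarray(weights)
    t = np.sum(w * np.tan((0.5 - p) * np.pi))   # Cauchy transform of each p-value
    return cauchy.sf(t)                          # tail approximation of the combined p-value

print(cauchy_combination([0.01, 0.20, 0.65, 0.003]))
```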
The Effect of Social Density on Word of Mouth
This research investigates whether a contextual factor—social density, defined as the number of people in a given area—influences consumers’ propensity to share information. We propose that high- (vs. low-) density settings make consumers experience a loss of perceived control, which in turn makes them more likely to engage in word of mouth to restore it. Six studies, conducted online as well as in laboratory and naturalistic settings, provide support for this hypothesis. We demonstrate that social density increases the likelihood of sharing information with others and that a person’s chronic need for control moderates this effect. Consistent with the proposed process, the effect of social density on information sharing is attenuated when participants have the opportunity to restore control before they engage in word of mouth. We also provide evidence that sharing information restores perceived control in high-density environments, and we disentangle the effect of social density from that of physical proximity.
WHEN IS PARALLEL TRENDS SENSITIVE TO FUNCTIONAL FORM?
This paper assesses when the validity of difference-in-differences depends on functional form. We provide a novel characterization: the parallel trends assumption holds under all strictly monotonic transformations of the outcome if and only if a stronger “parallel trends”-type condition holds for the cumulative distribution function of untreated potential outcomes. This condition for parallel trends to be insensitive to functional form is satisfied if and essentially only if the population can be partitioned into a subgroup for which treatment is effectively randomly assigned and a remaining sub-group for which the distribution of untreated potential outcomes is stable over time. These conditions have testable implications, and we introduce falsification tests for the null that parallel trends is insensitive to functional form.
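A small numeric example makes the functional-form point concrete: untreated group means that move in parallel in levels need not move in parallel after a strictly monotonic transformation such as the logarithm. All numbers below are invented.

```python
# Parallel trends in levels versus logs, on invented group means.
import numpy as np

# Mean untreated potential outcomes (pre, post) for each group, in levels.
group_a = np.array([10.0, 12.0])
group_b = np.array([20.0, 22.0])

diff_in_trends_levels = (group_a[1] - group_a[0]) - (group_b[1] - group_b[0])
diff_in_trends_logs = (np.log(group_a[1]) - np.log(group_a[0])) - (
    np.log(group_b[1]) - np.log(group_b[0])
)

print(f"difference in trends, levels: {diff_in_trends_levels:.3f}")  # 0.000: parallel
print(f"difference in trends, logs:   {diff_in_trends_logs:.3f}")    # 0.087: not parallel
```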
Inference for subvectors and other functions of partially identified parameters in moment inequality models
This paper introduces a bootstrap-based inference method for functions of the parameter vector in a moment (in)equality model. These functions are restricted to be linear for two-sided testing problems, but may be nonlinear for one-sided testing problems. In the most common case, this function selects a subvector of the parameter, such as a single component. The new inference method we propose controls asymptotic size uniformly over a large class of data distributions and improves upon the two existing methods that deliver uniform size control for this type of problem: projection-based and subsampling inference. Relative to projection-based procedures, our method presents three advantages: (i) it weakly dominates in terms of finite sample power, (ii) it strictly dominates in terms of asymptotic power, and (iii) it is typically less computationally demanding. Relative to subsampling, our method presents two advantages: (i) it strictly dominates in terms of asymptotic power (for reasonable choices of subsample size), and (ii) it appears to be less sensitive to the choice of its tuning parameter than subsampling is to the choice of subsample size.
Physics potentials with the second Hyper-Kamiokande detector in Korea
Hyper-Kamiokande consists of two identical water-Cherenkov detectors totaling 520 kt, with the first in Japan, 295 km from the J-PARC neutrino beam at a 2.5° off-axis angle (OAA), and the second possibly in Korea at a later stage. A second detector in Korea would benefit almost all areas of neutrino oscillation physics, mainly owing to the longer baseline. There are several candidate sites in Korea with baselines of 1000–1300 km and OAAs of 1°–3°. We conducted sensitivity studies on neutrino oscillation physics for a second detector, either in Japan (JD × 2) or Korea (JD + KD), and compared the results with a single detector in Japan. Sensitivity to leptonic charge–parity (CP) symmetry violation is improved, especially when CP is non-maximally violated. The larger matter effect at the Korean candidate sites significantly enhances sensitivity to non-standard neutrino interactions and to mass-ordering determination. Current studies indicate that the best sensitivity is obtained at Mt. Bisul (1088 km baseline, 1.3° OAA). Thanks to a larger overburden (1000 m) than at the first detector site, clear improvements in sensitivity to solar and supernova relic neutrino searches are expected.