Catalogue Search | MBRL

The State of Applied Econometrics: Causality and Policy Evaluation

by Athey, Susan , Imbens, Guido W. in Analytical estimating , Causality , Discontinuity

2017

In this paper, we discuss recent developments in econometrics that we view as important for empirical researchers working on policy evaluation questions. We focus on three main areas, in each case highlighting recommendations for applied work. First, we discuss new research on identification strategies in program evaluation, with particular focus on synthetic control methods, regression discontinuity, external validity, and the causal interpretation of regression methods. Second, we discuss various forms of supplementary analyses, including placebo analyses as well as sensitivity and robustness analyses, intended to make the identification strategies more credible. Third, we discuss some implications of recent advances in machine learning methods for causal effects, including methods to adjust for differences between treated and control units in high-dimensional settings, and methods for identifying and estimating heterogeneous treatment effects.

Journal Article

Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies

by Ma, K. H. , Hsieh, T. C. , Gotelli, Nicholas J. in abundance data , Analytical estimating , Araneae

2014

Quantifying and assessing changes in biological diversity are central aspects of many ecological studies, yet accurate methods of estimating biological diversity from sampling data have been elusive. Hill numbers, or the effective number of species, are increasingly used to characterize the taxonomic, phylogenetic, or functional diversity of an assemblage. However, empirical estimates of Hill numbers, including species richness, tend to be an increasing function of sampling effort and, thus, tend to increase with sample completeness. Integrated curves based on sampling theory that smoothly link rarefaction (interpolation) and prediction (extrapolation) standardize samples on the basis of sample size or sample completeness and facilitate the comparison of biodiversity data. Here we extended previous rarefaction and extrapolation models for species richness (Hill number ^qD, where q = 0) to measures of taxon diversity incorporating relative abundance (i.e., for any Hill number ^qD, q > 0) and present a unified approach for both individual-based (abundance) data and sample-based (incidence) data. Using this unified sampling framework, we derive both theoretical formulas and analytic estimators for seamless rarefaction and extrapolation based on Hill numbers. Detailed examples are provided for the first three Hill numbers: q = 0 (species richness), q = 1 (the exponential of Shannon's entropy index), and q = 2 (the inverse of Simpson's concentration index). We developed a bootstrap method for constructing confidence intervals around Hill numbers, facilitating the comparison of multiple assemblages of both rarefied and extrapolated samples. The proposed estimators are accurate for both rarefaction and short-range extrapolation. For long-range extrapolation, the performance of the estimators depends on both the value of q and on the extrapolation range.
We tested our methods on simulated data generated from species abundance models and on data from large species inventories. We also illustrate the formulas and estimators using empirical data sets from biodiversity surveys of temperate forest spiders and tropical ants.
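
As a rough illustration of the three Hill numbers named in the abstract, the sketch below computes the plain empirical (plug-in) values from a vector of species counts. It is not the rarefaction and extrapolation estimator the paper develops; the function name `hill_number` and the sample counts are hypothetical.

```python
import numpy as np

def hill_number(abundances, q):
    """Empirical (plug-in) Hill number of order q for a vector of species counts."""
    counts = np.asarray(abundances, dtype=float)
    p = counts[counts > 0] / counts.sum()      # relative abundances of observed species
    if np.isclose(q, 1.0):
        # q = 1 is defined as a limit: the exponential of Shannon entropy
        return float(np.exp(-np.sum(p * np.log(p))))
    return float(np.sum(p ** q) ** (1.0 / (1.0 - q)))

counts = [50, 30, 15, 4, 1]                    # hypothetical abundance sample
richness = hill_number(counts, 0)              # q = 0: species richness
shannon = hill_number(counts, 1)               # q = 1: exp(Shannon entropy)
simpson = hill_number(counts, 2)               # q = 2: inverse Simpson concentration
print(richness, shannon, simpson)
```

For a perfectly even assemblage all three orders coincide with the number of species; the more uneven the counts, the further the q = 1 and q = 2 values fall below richness.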

Journal Article

Structural Topic Models for Open-Ended Survey Responses

by Roberts, Margaret E. , Gadarian, Shana Kushner , Stewart, Brandon M. in Academic disciplines , AJPS WORKSHOP , Alternative approaches

2014

Collection and especially analysis of open-ended survey responses are relatively rare in the discipline and when conducted are almost exclusively done through human coding. We present an alternative, semiautomated approach, the structural topic model (STM) (Roberts, Stewart, and Airoldi 2013; Roberts et al. 2013), that draws on recent developments in machine learning based analysis of textual data. A crucial contribution of the method is that it incorporates information about the document, such as the author's gender, political affiliation, and treatment assignment (if an experimental study). This article focuses on how the STM is helpful for survey researchers and experimentalists. The STM makes analyzing open-ended responses easier, more revealing, and capable of being used to estimate treatment effects. We illustrate these innovations with analysis of text from surveys and experiments.

Journal Article

A Simple Way to Estimate Bid-Ask Spreads from Daily High and Low Prices

by CORWIN, SHANE A. , SCHULTZ, PAUL in Analytical estimating , Asked price , Cost estimates

2012

We develop a bid-ask spread estimator from daily high and low prices. Daily high (low) prices are almost always buy (sell) trades. Hence, the high-low ratio reflects both the stock's variance and its bid-ask spread. Although the variance component of the high-low ratio is proportional to the return interval, the spread component is not. This allows us to derive a spread estimator as a function of high-low ratios over 1-day and 2-day intervals. The estimator is easy to calculate, can be applied in a variety of research areas, and generally outperforms other low-frequency estimators.

Journal Article

Common Errors: How to (and Not to) Control for Unobserved Heterogeneity

by Gormley, Todd A. , Matsa, David A. in Analytical estimating , Asset pricing , Computational methods

2014

Controlling for unobserved heterogeneity (or "common errors"), such as industry-specific shocks, is a fundamental challenge in empirical research. This paper discusses the limitations of two approaches widely used in corporate finance and asset pricing research: demeaning the dependent variable with respect to the group (e.g., "industry-adjusting") and adding the mean of the group's dependent variable as a control. We show that these methods produce inconsistent estimates and can distort inference. In contrast, the fixed effects estimator is consistent and should be used instead. We also explain how to estimate the fixed effects model when traditional methods are computationally infeasible.
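
The contrast drawn in the abstract can be seen in a small simulation. The sketch below is an illustration under assumed toy values, not the paper's own analysis: a group-level shock drives both the regressor and the outcome, and the true slope is 1. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_per_group, beta = 200, 50, 1.0     # toy design; true slope is beta

group = np.repeat(np.arange(n_groups), n_per_group)
shock = rng.normal(size=n_groups)[group]       # unobserved group-level shock
x = shock + rng.normal(size=group.size)        # regressor correlated with the shock
y = beta * x + shock + rng.normal(size=group.size)

def group_mean(v):
    """Mean of v within each group, broadcast back to the observations."""
    sums = np.bincount(group, weights=v, minlength=n_groups)
    return (sums / n_per_group)[group]

def slope(x_var, y_var):
    """OLS slope of y_var on x_var (with an intercept)."""
    xc = x_var - x_var.mean()
    return float(xc @ (y_var - y_var.mean()) / (xc @ xc))

naive = slope(x, y)                                   # ignores the shock entirely
adjusted = slope(x, y - group_mean(y))                # demeans only the dependent variable
fixed_effects = slope(x - group_mean(x), y - group_mean(y))  # within (fixed effects) estimator
print(naive, adjusted, fixed_effects)
```

In this design only the within estimator, which demeans both the regressor and the outcome, lands near the true slope; demeaning the outcome alone attenuates the estimate, and ignoring the shock inflates it.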

Journal Article

Doubly Robust Policy Evaluation and Optimization

by Erhan, Dumitru , Dudík, Miroslav , Li, Lihong in Analytical estimating , Bias , causal inference

2014

We study sequential decision making in environments where rewards are only partially observed, but can be modeled as a function of observed contexts and the chosen action by the decision maker. This setting, known as contextual bandits, encompasses a wide variety of applications such as health care, content recommendation and Internet advertising. A central task is evaluation of a new policy given historic data consisting of contexts, actions and received rewards. The key challenge is that the past data typically does not faithfully represent proportions of actions taken by a new policy. Previous approaches rely either on models of rewards or models of the past policy. The former are plagued by a large bias whereas the latter have a large variance. In this work, we leverage the strengths and overcome the weaknesses of the two approaches by applying the doubly robust estimation technique to the problems of policy evaluation and optimization. We prove that this approach yields accurate value estimates when we have either a good (but not necessarily consistent) model of rewards or a good (but not necessarily consistent) model of past policy. Extensive empirical comparison demonstrates that the doubly robust estimation uniformly improves over existing techniques, achieving both lower variance in value estimation and better policies. As such, we expect the doubly robust approach to become common practice in policy evaluation and optimization.
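
A minimal sketch of off-policy evaluation on simulated logged data may help fix ideas. It compares a reward-model-only estimate, an inverse-propensity estimate, and a doubly robust combination of the two. The data-generating process, the deliberately poor constant reward model, and every name below are assumptions made for illustration, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

context = rng.integers(0, 2, size=n)           # binary context
action = rng.integers(0, 2, size=n)            # logging policy: uniform over two actions
propensity = np.full(n, 0.5)                   # probability the logger chose each action
# Reward is Bernoulli with mean 0.7 when the action matches the context, else 0.2.
reward = rng.binomial(1, 0.2 + 0.5 * (action == context)).astype(float)

def target_policy(x):
    """New policy to evaluate: play the action equal to the context (true value 0.7)."""
    return x

def reward_model(x, a):
    """Deliberately biased reward model: predicts 0.4 everywhere."""
    return np.full(x.shape, 0.4)

chosen = target_policy(context)
match = (action == chosen).astype(float)       # 1 where the logged action agrees

direct = reward_model(context, chosen).mean()                 # model-only estimate
ips = (match / propensity * reward).mean()                    # propensity-only estimate
dr = (reward_model(context, chosen)
      + match / propensity * (reward - reward_model(context, action))).mean()
print(direct, ips, dr)
```

Because the logging propensities are known exactly here, the doubly robust estimate stays close to the true value even though the reward model is wrong, while the model-only estimate inherits the model's bias.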

Journal Article

Improving Marginal Likelihood Estimation for Bayesian Phylogenetic Model Selection

by Kuo, Lynn , Chen, Ming-Hui , Xie, Wangang in Analytical estimating , Bayes Theorem , Bayesian analysis

2011

The marginal likelihood is commonly used for comparing different evolutionary models in Bayesian phylogenetics and is the central quantity used in computing Bayes Factors for comparing model fit. A popular method for estimating marginal likelihoods, the harmonic mean (HM) method, can be easily computed from the output of a Markov chain Monte Carlo analysis but often greatly overestimates the marginal likelihood. The thermodynamic integration (TI) method is much more accurate than the HM method but requires more computation. In this paper, we introduce a new method, stepping stone sampling (SS), which uses importance sampling to estimate each ratio in a series (the "stepping stones") bridging the posterior and prior distributions. We compare the performance of the SS approach to the TI and HM methods in simulation and using real data. We conclude that the greatly increased accuracy of the SS and TI methods argues for their use instead of the HM method, despite the extra computation needed.
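
The stepping-stone idea can be sketched on a toy conjugate normal model, where each power posterior can be sampled exactly and the marginal likelihood is known in closed form. This is an illustration under stated assumptions, not the phylogenetic implementation: the model, the power schedule (evenly spaced points raised to the power 1/alpha with alpha = 0.3), and all names are choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, tau2 = 20, 1.0                          # toy data size and prior variance
y = rng.normal(loc=0.5, scale=1.0, size=n_obs) # y_i ~ N(theta, 1), prior theta ~ N(0, tau2)

def log_lik(theta):
    """Log-likelihood of y under N(theta, 1), vectorised over an array of theta draws."""
    resid = y[None, :] - theta[:, None]
    return -0.5 * n_obs * np.log(2 * np.pi) - 0.5 * np.sum(resid ** 2, axis=1)

def sample_power_posterior(b, size):
    """Exact draws from prior(theta) * likelihood(theta)**b, which is normal by conjugacy."""
    precision = 1.0 / tau2 + b * n_obs
    mean = b * y.sum() / precision
    return rng.normal(mean, np.sqrt(1.0 / precision), size=size)

def stepping_stone(n_steps=32, draws=2000, alpha=0.3):
    """Sum of log ratio estimates bridging the prior (power 0) and posterior (power 1)."""
    betas = (np.arange(n_steps + 1) / n_steps) ** (1.0 / alpha)
    log_ml = 0.0
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        ll = log_lik(sample_power_posterior(b_prev, draws))
        w = (b_next - b_prev) * ll             # log importance weights for this stone
        log_ml += w.max() + np.log(np.mean(np.exp(w - w.max())))  # stable log-mean-exp
    return log_ml

# Closed-form log marginal likelihood for this conjugate model, used as a check.
exact = (-0.5 * n_obs * np.log(2 * np.pi) - 0.5 * np.log(1 + n_obs * tau2)
         - 0.5 * (np.sum(y ** 2) - tau2 * y.sum() ** 2 / (1 + n_obs * tau2)))
estimate = stepping_stone()
print(estimate, exact)
```

Each ratio is estimated by importance sampling from the less powered distribution, so no single step has to bridge the whole distance between prior and posterior.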

Journal Article

Total-Evidence Dating under the Fossilized Birth–Death Process

by Zhang, Chi , Klopfstein, Seraina , Heath, Tracy A. in Analytical estimating , Animal age determination , Animals

2016

Bayesian total-evidence dating involves the simultaneous analysis of morphological data from the fossil record and morphological and sequence data from recent organisms, and it accommodates the uncertainty in the placement of fossils while dating the phylogenetic tree. Due to the flexibility of the Bayesian approach, total-evidence dating can also incorporate additional sources of information. Here, we take advantage of this and expand the analysis to include information about fossilization and sampling processes. Our work is based on the recently described fossilized birth-death (FBD) process, which has been used to model speciation, extinction, and fossilization rates that can vary over time in a piecewise manner. So far, sampling of extant and fossil taxa has been assumed to be either complete or uniformly at random, an assumption which is only valid for a minority of data sets. We therefore extend the FBD process to accommodate diversified sampling of extant taxa, which is standard practice in studies of higher-level taxa. We verify the implementation using simulations and apply it to the early radiation of Hymenoptera (wasps, ants, and bees). Previous total-evidence dating analyses of this data set were based on a simple uniform tree prior and dated the initial radiation of extant Hymenoptera to the late Carboniferous (309 Ma). The analyses using the FBD prior under diversified sampling, however, date the radiation to the Triassic and Permian (252 Ma), slightly older than the age of the oldest hymenopteran fossils. By exploring a variety of FBD model assumptions, we show that it is mainly the accommodation of diversified sampling that causes the push toward more recent divergence times. Accounting for diversified sampling thus has the potential to close the long-discussed gap between rocks and clocks. 
We conclude that the explicit modeling of fossilization and sampling processes can improve divergence time estimates, but only if all important model aspects, including sampling biases, are adequately addressed.

Journal Article

Causal Inference in Conjoint Analysis: Understanding Multidimensional Choices via Stated Preference Experiments

by Yamamoto, Teppei , Hainmueller, Jens , Hopkins, Daniel J. in Analysis , Analytical estimating , Attitudes

2014

Survey experiments are a core tool for causal inference. Yet, the design of classical survey experiments prevents them from identifying which components of a multidimensional treatment are influential. Here, we show how conjoint analysis, an experimental design yet to be widely applied in political science, enables researchers to estimate the causal effects of multiple treatment components and assess several causal hypotheses simultaneously. In conjoint analysis, respondents score a set of alternatives, where each has randomly varied attributes. Here, we undertake a formal identification analysis to integrate conjoint analysis with the potential outcomes framework for causal inference. We propose a new causal estimand and show that it can be nonparametrically identified and easily estimated from conjoint data using a fully randomized design. The analysis enables us to propose diagnostic checks for the identification assumptions. We then demonstrate the value of these techniques through empirical applications to voter decision making and attitudes toward immigrants.

Journal Article

Least squares after model selection in high-dimensional sparse models

by BELLONI, ALEXANDRE , CHERNOZHUKOV, VICTOR in Analytical estimating , Approximation , Eigenvalues

2013

In this article we study post-model selection estimators that apply ordinary least squares (OLS) to the model selected by first-step penalized estimators, typically Lasso. It is well known that Lasso can estimate the nonparametric regression function at nearly the oracle rate, and is thus hard to improve upon. We show that the OLS post-Lasso estimator performs at least as well as Lasso in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the Lasso-based model selection "fails" in the sense of missing some components of the "true" regression model. By the "true" model, we mean the best s-dimensional approximation to the nonparametric regression function chosen by the oracle. Furthermore, the OLS post-Lasso estimator can perform strictly better than Lasso, in the sense of a strictly faster rate of convergence, if the Lasso-based model selection correctly includes all components of the "true" model as a subset and also achieves sufficient sparsity. In the extreme case, when Lasso perfectly selects the "true" model, the OLS post-Lasso estimator becomes the oracle estimator. An important ingredient in our analysis is a new sparsity bound on the dimension of the model selected by Lasso, which guarantees that this dimension is at most of the same order as the dimension of the "true" model. Our rate results are nonasymptotic and hold in both parametric and nonparametric models. Moreover, our analysis is not limited to the Lasso estimator acting as a selector in the first step, but also applies to any other estimator, for example, various forms of thresholded Lasso, with good rates and good sparsity properties. Our analysis covers both traditional thresholding and a new practical, data-driven thresholding scheme that induces additional sparsity subject to maintaining a certain goodness of fit.
The latter scheme has theoretical guarantees similar to those of Lasso or OLS post-Lasso, but it dominates those procedures as well as traditional thresholding in a wide variety of experiments.
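
The two-step procedure described above can be sketched as follows: fit Lasso, keep the variables with nonzero coefficients, then refit those by OLS. The coordinate-descent Lasso, the fixed penalty `lam=0.3`, and the simulated design are assumptions for illustration only; the paper's penalty choice and thresholding schemes are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 50
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]                    # sparse "true" coefficients (toy values)
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)

def soft_threshold(z, lam):
    """Shrink z toward zero by lam, setting it to zero if |z| <= lam."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n_rows, n_cols = X.shape
    b = np.zeros(n_cols)
    col_ss = (X ** 2).sum(axis=0) / n_rows
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(n_cols):
            resid += X[:, j] * b[j]            # take coordinate j out of the current fit
            rho = X[:, j] @ resid / n_rows
            b[j] = soft_threshold(rho, lam) / col_ss[j]
            resid -= X[:, j] * b[j]            # put the updated coordinate back
    return b

lasso = lasso_cd(X, y, lam=0.3)                # step 1: penalised fit and selection
selected = np.flatnonzero(lasso)

post = np.zeros(p)                             # step 2: unpenalised OLS on the selected columns
post[selected] = np.linalg.lstsq(X[:, selected], y, rcond=None)[0]
print(selected, np.linalg.norm(lasso - beta), np.linalg.norm(post - beta))
```

The refit removes the shrinkage that the penalty imposes on the selected coefficients, which is the smaller-bias advantage the abstract refers to.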

Journal Article
