Catalogue Search | MBRL

Causal inference and the data-fusion problem

by Pearl, Judea , Bareinboim, Elias in COLLOQUIUM PAPER , Computer Sciences , Physical Sciences

2016

We review concepts, principles, and tools that unify current approaches to causal analysis and attend to new challenges presented by big data. In particular, we address the problem of data fusion—piecing together multiple datasets collected under heterogeneous conditions (i.e., different populations, regimes, and sampling methods) to obtain valid answers to queries of interest. The availability of multiple heterogeneous datasets presents new opportunities to big data analysts, because the knowledge that can be acquired from combined data would not be possible from any individual source alone. However, the biases that emerge in heterogeneous environments require new analytical tools. Some of these biases, including confounding, sampling selection, and cross-population biases, have been addressed in isolation, largely in restricted parametric models. We here present a general, nonparametric framework for handling these biases and, ultimately, a theoretical solution to the problem of data fusion in causal inference tasks.

Journal Article

Recursive partitioning for heterogeneous causal effects

by Athey, Susan , Imbens, Guido in COLLOQUIUM PAPER , Computer Simulation , Machine Learning

2016

In this paper we propose methods for estimating heterogeneity in causal effects in experimental and observational studies and for conducting hypothesis tests about the magnitude of differences in treatment effects across subsets of the population. We provide a data-driven approach to partition the data into subpopulations that differ in the magnitude of their treatment effects. The approach enables the construction of valid confidence intervals for treatment effects, even with many covariates relative to the sample size, and without “sparsity” assumptions. We propose an “honest” approach to estimation, whereby one sample is used to construct the partition and another to estimate treatment effects for each subpopulation. Our approach builds on regression tree methods, modified to optimize for goodness of fit in treatment effects and to account for honest estimation. Our model selection criterion anticipates that bias will be eliminated by honest estimation and also accounts for the effect of making additional splits on the variance of treatment effect estimates within each subpopulation. We address the challenge that the “ground truth” for a causal effect is not observed for any individual unit, so that standard approaches to cross-validation must be modified. Through a simulation study, we show that for our preferred method honest estimation results in nominal coverage for 90% confidence intervals, whereas coverage ranges between 74% and 84% for nonhonest approaches. Honest estimation requires estimating the model with a smaller sample size; the cost in terms of mean squared error of treatment effects for our preferred method ranges between 7% and 22%.
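
The sample-splitting idea behind honest estimation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: it assumes a single covariate, a single split, and a simplified splitting rule (largest gap between leaf effects) in place of the paper's model selection criterion. The names `effect`, `choose_split`, and `honest_effects` are hypothetical.

```python
import random
from statistics import mean

def effect(rows):
    """Mean outcome of treated minus control; None if either arm is empty."""
    treated = [y for _, w, y in rows if w == 1]
    control = [y for _, w, y in rows if w == 0]
    if not treated or not control:
        return None
    return mean(treated) - mean(control)

def choose_split(rows, candidates):
    """Pick the threshold on x with the largest gap between leaf effects.

    Simplified stand-in for a tree-splitting criterion (an assumption)."""
    best, best_gap = None, -1.0
    for c in candidates:
        left = effect([r for r in rows if r[0] <= c])
        right = effect([r for r in rows if r[0] > c])
        if left is None or right is None:
            continue
        gap = abs(left - right)
        if gap > best_gap:
            best, best_gap = c, gap
    return best

def honest_effects(rows, candidates, seed=0):
    """One half of the sample builds the partition, the other estimates effects."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    build, estimate = shuffled[:half], shuffled[half:]
    cut = choose_split(build, candidates)
    if cut is None:
        return None, {}
    return cut, {
        "left": effect([r for r in estimate if r[0] <= cut]),
        "right": effect([r for r in estimate if r[0] > cut]),
    }

# Toy data as (x, treatment, outcome): the treatment only matters when x > 0.5.
rows = [(i / 200, i % 2, 2.0 * (i % 2) if i > 100 else 0.0) for i in range(200)]
cut, effects = honest_effects(rows, [0.25, 0.5, 0.75])
print(cut, effects)
```

Because the estimation half plays no part in choosing the split, the leaf estimates are not biased by the search over thresholds. The price is that each step sees only half of the data, which is the sample-size cost the abstract mentions.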

Journal Article

Causal inference in economics and marketing

by Varian, Hal R. in Computer Sciences , Economic Sciences , Physical Sciences

2016

This is an elementary introduction to causal inference in economics written for readers familiar with machine learning methods. The critical step in any causal analysis is estimating the counterfactual—a prediction of what would have happened in the absence of the treatment. The powerful techniques used in machine learning may be useful for developing better estimates of the counterfactual, potentially improving causal inference.

Journal Article

Estimating peer effects in networks with peer encouragement designs

by Eckles, Dean , Kizilcec, René F. , Bakshy, Eytan in Humans , Peer Group , Physical Sciences

2016

Peer effects, in which the behavior of an individual is affected by the behavior of their peers, are central to social science. Because peer effects are often confounded with homophily and common external causes, recent work has used randomized experiments to estimate effects of specific peer behaviors. These experiments have often relied on the experimenter being able to randomly modulate mechanisms by which peer behavior is transmitted to a focal individual. We describe experimental designs that instead randomly assign individuals’ peers to encouragements to behaviors that directly affect those individuals. We illustrate this method with a large peer encouragement design on Facebook for estimating the effects of receiving feedback from peers on posts shared by focal individuals. We find evidence for substantial effects of receiving marginal feedback on multiple behaviors, including giving feedback to others and continued posting. These findings provide experimental evidence for the role of behaviors directed at specific individuals in the adoption and continued use of communication technologies. In comparison, observational estimates differ substantially, both underestimating and overestimating effects, suggesting that researchers and policy makers should be cautious in relying on them.

Journal Article

Modeling confounding by half-sibling regression

by Janzing, Dominik , Hogg, David W. , Peters, Jonas in COLLOQUIUM PAPER , Computer Sciences , Physical Sciences

2016

We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as “half-sibling regression,” is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed data and time series data, and illustrate the potential of the method in a challenging astronomy application.
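
As a rough illustration of the idea, suppose the observed signal `y` is the latent quantity plus a nuisance term, and a second observed variable `x` is affected by the same nuisance but not by the latent quantity. Regressing `y` on `x` and keeping the residual then removes the part of `y` that `x` can explain. The sketch below assumes a single predictor and a linear fit, which is a simplification chosen for brevity; the function names are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def half_sibling_residual(x, y):
    """Estimate the latent signal as y minus its regression on x."""
    slope, intercept = fit_line(x, y)
    return [b - (slope * a + intercept) for a, b in zip(x, y)]

# Toy example: x carries only the nuisance; y = latent signal + 3 * nuisance.
x = [-1.0, 1.0, -1.0, 1.0]
q = [1.0, 1.0, -1.0, -1.0]          # latent signal, uncorrelated with x
y = [qi + 3.0 * xi for qi, xi in zip(q, x)]
print(half_sibling_residual(x, y))
```

The residual can recover the latent signal only up to an additive constant, and only to the extent that `x` is unrelated to that signal; in the toy data both conditions hold by construction.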

Journal Article

Hill's considerations are not causal criteria

by Pearce, Neil , Rothman, Kenneth J. , Savitz, David A. in Algorithms , Algorithms for causal inference , Bias

2026

Hill's list of considerations for assessing causality, proposed 60 years ago, became a landmark in the interpretation of epidemiologic evidence. However, it has been and continues to be misused as a list of causal criteria to be scored and summed, despite causal inference being unattainable through the application of this or any other algorithm. Recognizing the distinction between statistical associations and causal effects was a key contribution of Hill. While he identified several clues for distinguishing between causal and noncausal associations, causal inference in epidemiology has become much more explicit and effective. Rather than relying on Hill's indirect hints of potential bias by considering strength of association or dose-response gradients, newer methods such as quantitative bias analysis directly assess confounding and other candidate biases that compete with causal explanations, leading to more informed inferences. Similarly, the interpretation of consistency depends on variation in methods across studies; triangulation may be used to search for informative inconsistencies, strengthening causal inference. Most importantly, a causal connection is not a categorical property bestowed upon an association based on Hill's considerations or any other checklist. Causal inference is an inherently indirect process, with the inference gradually crystallizing by withstanding challenges from competing theories in which other explanations, including random error or biases, are found not to account for the measured association.

• Hill's considerations have been misused as a checklist for causality.
• Causal inference methods have advanced considerably after Hill's publication.
• Causal inference is based on evidence from competing candidate explanations.

Journal Article

Limitations of design-based causal inference and A/B testing under arbitrary and network interference

by Basse, Guillaume W. , Airoldi, Edoardo M. in Assumptions , Bias , Causality

2018

Randomized experiments on a network often involve interference between connected units, namely, a situation in which an individual's treatment can affect the response of another individual. Current approaches to deal with interference, in theory and in practice, often make restrictive assumptions on its structure (for instance, assuming that interference is local) even when using otherwise nonparametric inference strategies. This reliance on explicit restrictions on the interference mechanism suggests a shared intuition that inference is impossible without any assumptions on the interference structure. In this paper, we begin by formalizing this intuition in the context of a classical nonparametric approach to inference, referred to as design-based inference of causal effects. Next, we show how, still in the context of design-based inference, even parametric structural assumptions that allow the existence of unbiased estimators cannot guarantee a decreasing variance even in the large sample limit. This lack of concentration in large samples is often observed empirically, in randomized experiments in which interference of some form is expected to be present. This result has direct consequences for the design and analysis of large experiments (for instance, on online social platforms) where the belief is that large sample sizes automatically guarantee small variance. More broadly, our results suggest that although strategies for causal inference in the presence of interference borrow their formalism and main concepts from the traditional causal inference literature, much of the intuition from the no-interference case does not easily transfer to the interference setting.

Journal Article

Causal Inference and Observational Research: The Utility of Twins

by Christensen, Kaare , Osler, Merete , McGue, Matt in Ageing , Alcohol drinking , Applied psychology

2010

Valid causal inference is central to progress in theoretical and applied psychology. Although the randomized experiment is widely considered the gold standard for determining whether a given exposure increases the likelihood of some specified outcome, experiments are not always feasible and in some cases can result in biased estimates of causal effects. Alternatively, standard observational approaches are limited by the possibility of confounding, reverse causation, and the nonrandom distribution of exposure (i.e., selection). We describe the counter-factual model of causation and apply it to the challenges of causal inference in observational research, with a particular focus on aging. We argue that the study of twin pairs discordant on exposure, and in particular discordant monozygotic twins, provides a useful analog to the idealized counter-factual design. A review of discordant-twin studies in aging reveals that they are consistent with, but do not unambiguously establish, a causal effect of lifestyle factors on important late-life outcomes. Nonetheless, the existing studies are few in number and have clear limitations that have not always been considered in interpreting their results. It is concluded that twin researchers could make greater use of the discordant-twin design as one approach to strengthen causal inferences in observational research.

Journal Article

Improving massive experiments with threshold blocking

by Higgins, Michael J. , Sävje, Fredrik , Sekhon, Jasjeet S. in COLLOQUIUM PAPER , Economics , Nationalekonomi

2016

Inferences from randomized experiments can be improved by blocking: assigning treatment in fixed proportions within groups of similar units. However, the use of the method is limited by the difficulty in deriving these groups. Current blocking methods are restricted to special cases or run in exponential time; are not sensitive to clustering of data points; and are often heuristic, providing an unsatisfactory solution in many common instances. We present an algorithm that implements a widely applicable class of blocking—threshold blocking—that solves these problems. Given a minimum required group size and a distance metric, we study the blocking problem of minimizing the maximum distance between any two units within the same group. We prove this is a nondeterministic polynomial-time hard problem and derive an approximation algorithm that yields a blocking where the maximum distance is guaranteed to be, at most, four times the optimal value. This algorithm runs in O(n log n) time with O(n) space complexity. This makes it, to our knowledge, the first blocking method with an ensured level of performance that works in massive experiments. Whereas many commonly used algorithms form pairs of units, our algorithm constructs the groups flexibly for any chosen minimum size. This facilitates complex experiments with several treatment arms and clustered data. A simulation study demonstrates the efficiency and efficacy of the algorithm; tens of millions of units can be blocked using a desktop computer in a few minutes.
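
To make the blocking objective concrete, the following Python sketch forms blocks of at least `k` units on a single covariate, reports the largest within-block distance (the quantity the abstract says is minimized), and then assigns treatment in fixed proportions within each block. The sort-and-chunk rule used here is a naive one-dimensional heuristic chosen for illustration. It is not the approximation algorithm described in the paper, and all function names are hypothetical.

```python
import random
from itertools import combinations

def chunk_blocks(values, k):
    """Naive 1-D blocking: sort units, cut into consecutive groups of size >= k."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    blocks = [order[i:i + k] for i in range(0, len(order), k)]
    if len(blocks) > 1 and len(blocks[-1]) < k:
        tail = blocks.pop()          # fold a short final group into its neighbour
        blocks[-1].extend(tail)
    return blocks

def max_within_distance(values, blocks):
    """Largest distance between any two units that share a block."""
    return max(
        (abs(values[i] - values[j]) for b in blocks for i, j in combinations(b, 2)),
        default=0.0,
    )

def assign_within_blocks(blocks, seed=0):
    """Treat half of each block (rounded down), chosen at random."""
    rng = random.Random(seed)
    treated = set()
    for b in blocks:
        treated.update(rng.sample(b, len(b) // 2))
    return treated

values = [0.1, 0.9, 0.15, 0.8, 0.2, 0.85, 0.5]   # one covariate per unit
blocks = chunk_blocks(values, 2)
print(blocks, max_within_distance(values, blocks))
print(sorted(assign_within_blocks(blocks)))
```

Treating the minimum size as a lower bound rather than an exact size is what lets the last block absorb leftover units, which matches the abstract's description of groups being constructed flexibly for any chosen minimum size.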

Journal Article

Management of Synanthropic Macaque (Macaca fascicularis) Populations in Bali: Assessing the Implications of Sterilization on Female Social Dynamics

by de Thier Nagelmackers, Fanny , Deleuze, Stefan , Wandia, I. Nengah in Aggression , Aggressiveness , Behavior

2025

Growing contacts between humans and nonhuman primates at interface zones bring forth the need to better understand the efficiency and implications of synanthropic primate population management strategies. In this context, the expanding use of fertility control contrasts with the limited documentation of its potential consequences for primate behavior and social dynamics. Unlike other methods, tubectomy preserves the ovarian functions involved in sexual motivation of female macaques. However, sexual behaviors and aggression could intensify due to a higher proportion of cycling females within the group. In this study, we assessed whether tubectomy modifies the sociosexual interactions of female long-tailed macaques (Macaca fascicularis) in a primate-tourism site in Bali, Indonesia. Using focal sampling over a three-year period (N = 56 females), we investigated changes in (a) female sociosexual activities (i.e., sexual and grooming interactions with males), and (b) female intrasexual aggression (i.e., female-female agonistic interactions). Using causal inference statistics, we found that (a) compared with intact females, sterilized females were more sexually receptive and attractive, and they received longer grooming bouts from male partners. Surprisingly, (b) tubectomy was associated with decreased intrasexual aggression among females, as sterilized females received aggression from fewer female opponents compared with intact females. This study showed that, at least in the short term, tubectomy modifies the sociosexual interactions, while not heightening female aggression. These findings may inform management decisions that maximize social stability and welfare of synanthropic populations. However, the long-term implications of female sterility for social dynamics warrant further investigation.

Journal Article
