Catalogue Search | MBRL

Model misspecification in approximate Bayesian computation

by Rousseau, Judith; Frazier, David T.; Robert, Christian P. in Approximate Bayesian computation, Asymptotic properties, Asymptotics

2020

We analyse the behaviour of approximate Bayesian computation (ABC) when the model generating the simulated data differs from the actual data-generating process, i.e. when the data simulator in ABC is misspecified. We demonstrate both theoretically and in simple, but practically relevant, examples that when the model is misspecified different versions of ABC can yield substantially different results. Our theoretical results demonstrate that even though the model is misspecified, under regularity conditions, the accept–reject ABC approach concentrates posterior mass on an appropriately defined pseudotrue parameter value. However, under model misspecification the ABC posterior does not yield credible sets with valid frequentist coverage and has non-standard asymptotic behaviour. In addition, we examine the theoretical behaviour of the popular local regression adjustment to ABC under model misspecification and demonstrate that this approach concentrates posterior mass on a pseudotrue value that is completely different from that of accept–reject ABC. Using our theoretical results, we suggest two approaches to diagnose model misspecification in ABC. All theoretical results and diagnostics are illustrated in a simple running example.

Journal Article
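
As a rough illustration of the accept–reject ABC setting described in this record, the following Python sketch runs the sampler with a deliberately misspecified simulator. Everything specific here is an assumption made for illustration and is not taken from the record: the Gaussian toy model, the N(0, 5^2) prior, the (mean, variance) summaries, the quantile-based tolerance, and the function names `summaries` and `abc_accept_reject`.

```python
import numpy as np

def summaries(x):
    """Summary statistics: sample mean and sample variance (illustrative choice)."""
    return np.array([x.mean(), x.var(ddof=1)])

def abc_accept_reject(observed, n_draws=20000, quantile=0.01, seed=0):
    """Accept-reject ABC with an assumed simulator N(theta, 1).

    Keeps the prior draws whose simulated summaries fall closest to the
    observed summaries; the tolerance is set as a quantile of the distances.
    """
    rng = np.random.default_rng(seed)
    n = observed.size
    s_obs = summaries(observed)
    # Assumed prior: theta ~ N(0, 5^2).
    theta = rng.normal(0.0, 5.0, size=n_draws)
    # One simulated data set per prior draw, always with unit variance.
    sims = rng.normal(theta[:, None], 1.0, size=(n_draws, n))
    s_sim = np.column_stack([sims.mean(axis=1), sims.var(axis=1, ddof=1)])
    dist = np.linalg.norm(s_sim - s_obs, axis=1)
    eps = np.quantile(dist, quantile)
    keep = dist <= eps
    return theta[keep], dist[keep]

# Observed data come from N(1, 2^2), so the unit-variance simulator is misspecified.
observed = np.random.default_rng(1).normal(1.0, 2.0, size=100)
accepted, accepted_dist = abc_accept_reject(observed)
print("accepted draws:", accepted.size)
print("mean of accepted theta:", accepted.mean())
print("smallest accepted distance:", accepted_dist.min())
```

In this toy setup the accepted draws still centre near the observed mean, but the smallest distance stays well above zero because no value of theta lets the simulator match the observed variance. A distance floor of this kind is one informal symptom of misspecification; the sketch does not reproduce the diagnostics proposed in the article.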

Deciphering the Routes of Invasion of Drosophila suzukii by Means of ABC Random Forest

by Singh, Nadia; Xuéreb, Anne; Richmond, Maxi Polihronakis in Bayesian analysis, Biological invasions, Datasets

2017

Deciphering invasion routes from molecular data is crucial to understanding biological invasions, including identifying bottlenecks in population size and admixture among distinct populations. Here, we unravel the invasion routes of the invasive pest Drosophila suzukii using a multi-locus microsatellite dataset (25 loci on 23 worldwide sampling locations). To do this, we use approximate Bayesian computation (ABC), which has improved the reconstruction of invasion routes, but can be computationally expensive. We use our study to illustrate the use of a new, more efficient, ABC method, ABC random forest (ABC-RF) and compare it to a standard ABC method (ABC-LDA). We find that Japan emerges as the most probable source of the earliest recorded invasion into Hawaii. Southeast China and Hawaii together are the most probable sources of populations in western North America, which then in turn served as sources for those in eastern North America. European populations are genetically more homogeneous than North American populations, and their most probable source is northeast China, with evidence of limited gene flow from the eastern US as well. All introduced populations passed through bottlenecks, and analyses reveal five distinct admixture events. These findings can inform hypotheses concerning how this species evolved between different and independent source and invasive populations. Methodological comparisons indicate that ABC-RF and ABC-LDA show concordant results if ABC-LDA is based on a large number of simulated datasets but that ABC-RF out-performs ABC-LDA when using a comparable and more manageable number of simulated datasets, especially when analyzing complex introduction scenarios.

Journal Article

Building up biogeography: Pattern to process

by Davies, T. Jonathan; Harte, John; Barbosa, A. Márcia in Approximate Bayesian computation (ABC), Bayesian analysis

2018

Linking pattern to process across spatial and temporal scales has been a key goal of the field of biogeography. In January 2017, the 8th biennial conference of the International Biogeography Society sponsored a symposium on Building up biogeography—process to pattern that aimed to review progress towards this goal. Here we present a summary of the symposium, in which we identified promising areas of current research and suggested future research directions. We focus on (1) emerging types of data such as behavioural observations and ancient DNA, (2) how to better incorporate historical data (such as fossils) to move beyond what we term "footprint measures" of past dynamics and (3) the role that novel modelling approaches (e.g. maximum entropy theory of ecology and approximate Bayesian computation) and conceptual frameworks can play in the unification of disciplines. We suggest that the gaps separating pattern and process are shrinking, and that we can better bridge these aspects by considering the dimensions of space and time simultaneously.

Journal Article

Bayesian Synthetic Likelihood

by Price, L. F.; Drovandi, C. C.; Lee, A. in Statistical Computing

2018

Having the ability to work with complex models can be highly beneficial. However, complex models often have intractable likelihoods, so methods that involve evaluation of the likelihood function are infeasible. In these situations, the benefits of working with likelihood-free methods become apparent. Likelihood-free methods, such as parametric Bayesian indirect likelihood that uses the likelihood of an alternative parametric auxiliary model, have been explored throughout the literature as a viable alternative when the model of interest is complex. One of these methods is called the synthetic likelihood (SL), which uses a multivariate normal approximation of the distribution of a set of summary statistics. This article explores the accuracy and computational efficiency of the Bayesian version of the synthetic likelihood (BSL) approach in comparison to a competitor known as approximate Bayesian computation (ABC) and its sensitivity to its tuning parameters and assumptions. We relate BSL to pseudo-marginal methods and propose to use an alternative SL that uses an unbiased estimator of the SL, when the summary statistics have a multivariate normal distribution. Several applications of varying complexity are considered to illustrate the findings of this article.

Journal Article
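
The synthetic likelihood idea summarised in this record (a multivariate normal approximation to the distribution of the summary statistics) can be sketched in a few lines of Python. This is the plain plug-in estimator only, not the unbiased variant the abstract proposes, and the toy simulator, the (mean, log-variance) summaries, and the names `simulate_summaries` and `synthetic_loglik` are assumptions made for illustration.

```python
import numpy as np

def simulate_summaries(theta, n_obs, m, rng):
    """Hypothetical simulator: m data sets from N(theta, 1), each reduced to
    the summaries (sample mean, log sample variance)."""
    x = rng.normal(theta, 1.0, size=(m, n_obs))
    return np.column_stack([x.mean(axis=1), np.log(x.var(axis=1, ddof=1))])

def synthetic_loglik(theta, s_obs, n_obs, m=200, seed=0):
    """Plug-in synthetic log-likelihood: fit a Gaussian to the simulated
    summaries and evaluate its log-density at the observed summaries."""
    # Reusing the same seed for every theta gives common random numbers,
    # which keeps the estimated surface smooth in theta.
    rng = np.random.default_rng(seed)
    s = simulate_summaries(theta, n_obs, m, rng)
    mu = s.mean(axis=0)
    cov = np.cov(s, rowvar=False)
    diff = s_obs - mu
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    d = s_obs.size
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)

# Toy observed data and a grid evaluation of the synthetic log-likelihood.
observed = np.random.default_rng(42).normal(1.5, 1.0, size=50)
s_obs = np.array([observed.mean(), np.log(observed.var(ddof=1))])
grid = np.linspace(-1.0, 4.0, 51)
loglik = np.array([synthetic_loglik(t, s_obs, observed.size) for t in grid])
best_theta = grid[np.argmax(loglik)]
print("grid maximiser of the synthetic log-likelihood:", best_theta)
```

In a Bayesian version, this estimate would stand in for the intractable likelihood inside an MCMC sampler; the number of simulations `m` is the kind of tuning parameter whose influence the abstract says the article studies.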

The frontier of simulation-based inference

by Cranmer, Kyle; Louppe, Gilles; Brehmer, Johann in Approximate Bayesian Computation, COLLOQUIUM PAPERS, Computer science

2020

Many domains of science have developed complex simulations to describe phenomena of interest. While these simulations provide high-fidelity models, they are poorly suited for inference and lead to challenging inverse problems. We review the rapidly developing field of simulation-based inference and identify the forces giving additional momentum to the field. Finally, we describe how the frontier is expanding so that a broad audience can appreciate the profound influence these developments may have on science.

Journal Article

Approximate Bayesian computation with the Wasserstein distance

by Robert, Christian P.; Gerber, Mathieu; Jacob, Pierre E. in Approximate Bayesian computation, Bayesian analysis, Bayesian theory

2019

A growing number of generative statistical models do not permit the numerical evaluation of their likelihood functions. Approximate Bayesian computation has become a popular approach to overcome this issue, in which one simulates synthetic data sets given parameters and compares summaries of these data sets with the corresponding observed values. We propose to avoid the use of summaries and the ensuing loss of information by instead using the Wasserstein distance between the empirical distributions of the observed and synthetic data. This generalizes the well-known approach of using order statistics within approximate Bayesian computation to arbitrary dimensions. We describe how recently developed approximations of the Wasserstein distance allow the method to scale to realistic data sizes, and we propose a new distance based on the Hilbert space filling curve. We provide a theoretical study of the method proposed, describing consistency as the threshold goes to 0 while the observations are kept fixed, and concentration properties as the number of observations grows. Various extensions to time series data are discussed. The approach is illustrated on various examples, including univariate and multivariate g-and-k distributions, a toggle switch model from systems biology, a queuing model and a Lévy-driven stochastic volatility model.

Journal Article
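
The link this record draws between the Wasserstein distance and order statistics can be made concrete in one dimension: for two samples of equal size, the distance between their empirical distributions is obtained by pairing the sorted values. The Python sketch below uses that fact inside a basic accept-reject loop. The N(theta, 1) toy model, the uniform prior on (-5, 5), the quantile-based tolerance, and the names `wasserstein_1d` and `wasserstein_abc` are assumptions made for illustration; the multivariate approximations and the Hilbert-curve distance mentioned in the abstract are not covered.

```python
import numpy as np

def wasserstein_1d(x, y, p=1):
    """p-Wasserstein distance between two equal-size 1-D samples,
    computed by matching order statistics."""
    x_sorted, y_sorted = np.sort(x), np.sort(y)
    return np.mean(np.abs(x_sorted - y_sorted) ** p) ** (1.0 / p)

def wasserstein_abc(observed, n_draws=5000, quantile=0.02, seed=0):
    """Accept-reject ABC that compares full samples instead of summaries."""
    rng = np.random.default_rng(seed)
    n = observed.size
    # Assumed prior: theta ~ Uniform(-5, 5); assumed simulator: N(theta, 1).
    theta = rng.uniform(-5.0, 5.0, size=n_draws)
    dist = np.array([wasserstein_1d(observed, rng.normal(t, 1.0, size=n))
                     for t in theta])
    eps = np.quantile(dist, quantile)
    return theta[dist <= eps]

observed = np.random.default_rng(7).normal(2.0, 1.0, size=100)
accepted = wasserstein_abc(observed)
print("accepted draws:", accepted.size)
print("mean of accepted theta:", accepted.mean())
```

Because the distance acts on the whole sample, no summary statistics have to be chosen, which is the motivation stated in the abstract.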

Fundamentals and Recent Developments in Approximate Bayesian Computation

by Gutmann, Michael U.; Corander, Jukka; Dutta, Ritabrata in Algorithms, Bayes Theorem, Bayesian analysis

2017

Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.]

Journal Article

Phylogenomics Reveals an Ancient Hybrid Origin of the Persian Walnut

by Zhang, Bo-Wen; Lin, Kui; Woeste, Keith E in Bayesian analysis, Computation, Genomes

2019

Persian walnut (Juglans regia) is cultivated worldwide for its high-quality wood and nuts, but its origin has remained mysterious because in phylogenies it occupies an unresolved position between American black walnuts and Asian butternuts. Equally unclear is the origin of the only American butternut, J. cinerea. We resequenced the whole genome of 80 individuals from 19 of the 22 species of Juglans and assembled the genome of its relatives Pterocarya stenoptera and Platycarya strobilacea. Using phylogenetic-network analysis of single-copy nuclear genes, genome-wide site pattern probabilities, and Approximate Bayesian Computation, we discovered that J. regia (and its landrace J. sigillata) arose as a hybrid between the American and the Asian lineages and that J. cinerea resulted from massive introgression from an immigrating Asian butternut into the genome of an American black walnut. Approximate Bayesian Computation modeling placed the hybrid origin in the late Pliocene, ∼3.45 My, with both parental lineages since having gone extinct in Europe.

Journal Article

Learning Summary Statistic for Approximate Bayesian Computation via Deep Neural Network

by Jiang, Bai; Zheng, Charles; Wong, Wing H.

2017

Approximate Bayesian Computation (ABC) methods are used to approximate posterior distributions in models with unknown or computationally intractable likelihoods. Both the accuracy and computational efficiency of ABC depend on the choice of summary statistic, but outside of special cases where the optimal summary statistics are known, it is unclear which guiding principles can be used to construct effective summary statistics. In this paper we explore the possibility of automating the process of constructing summary statistics by training deep neural networks to predict the parameters from artificially generated data: the resulting summary statistics are approximately posterior means of the parameters. With minimal model-specific tuning, our method constructs summary statistics for the Ising model and the moving-average model, which match or exceed theoretically-motivated summary statistics in terms of the accuracies of the resulting posteriors.

Journal Article

Toward an Evolutionarily Appropriate Null Model: Jointly Inferring Demography and Purifying Selection

by Jensen, Jeffrey D; Charlesworth, Brian; Johri, Parul in Animals, Demographics, Demography

2020

The question of the relative evolutionary roles of adaptive and nonadaptive processes has been a central debate in population genetics for nearly a century. While advances have been made in the theoretical development of the underlying models, and statistical methods for estimating their parameters from large-scale genomic data, a framework for an appropriate null model remains elusive. A model incorporating evolutionary processes known to be in constant operation, genetic drift (as modulated by the demographic history of the population) and purifying selection, is lacking. Without such a null model, the role of adaptive processes in shaping within- and between-population variation may not be accurately assessed. Here, we investigate how population size changes and the strength of purifying selection affect patterns of variation at “neutral” sites near functional genomic components. We propose a novel statistical framework for jointly inferring the contribution of the relevant selective and demographic parameters. By means of extensive performance analyses, we quantify the utility of the approach, identify the most important statistics for parameter estimation, and compare the results with existing methods. Finally, we reanalyze genome-wide population-level data from a Zambian population of Drosophila melanogaster, and find that it has experienced a much slower rate of population growth than was inferred when the effects of purifying selection were neglected. Our approach represents an appropriate null model, against which the effects of positive selection can be assessed.

Journal Article
