Catalogue Search | MBRL
89,605 result(s) for "Bayesian Analysis"
Variational Inference: A Review for Statisticians
by Blei, David M.; Kucukelbir, Alp; McAuliffe, Jon D.
in Algorithms; Americans; artificial intelligence
2017
One of the core problems of modern statistics is to approximate difficult-to-compute probability densities. This problem is especially important in Bayesian statistics, which frames all inference about unknown quantities as a calculation involving the posterior density. In this article, we review variational inference (VI), a method from machine learning that approximates probability densities through optimization. VI has been used in many applications and tends to be faster than classical methods, such as Markov chain Monte Carlo sampling. The idea behind VI is to first posit a family of densities and then to find a member of that family which is close to the target density. Closeness is measured by Kullback-Leibler divergence. We review the ideas behind mean-field variational inference, discuss the special case of VI applied to exponential family models, present a full example with a Bayesian mixture of Gaussians, and derive a variant that uses stochastic optimization to scale up to massive data. We discuss modern research in VI and highlight important open problems. VI is powerful, but it is not yet well understood. Our hope in writing this article is to catalyze statistical research on this class of algorithms. Supplementary materials for this article are available online.
Journal Article
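As a quick, editorial illustration of the mean-field scheme this abstract reviews, here is a minimal coordinate-ascent (CAVI) sketch for the Bayesian mixture of Gaussians example it mentions. Unit variances, uniform mixture weights, the prior variance, and the initialization are all simplifying assumptions, not the authors' code.

import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)])
K, prior_var = 2, 10.0                 # components; assumed prior variance of each mean

m = np.quantile(x, [0.25, 0.75])       # crude but well-separated initialization
s2 = np.ones(K)                        # variational variances of the cluster means
for _ in range(50):
    # responsibilities phi (n x K), up to per-row normalization
    log_phi = np.outer(x, m) - 0.5 * (s2 + m**2)
    phi = np.exp(log_phi - log_phi.max(axis=1, keepdims=True))
    phi /= phi.sum(axis=1, keepdims=True)
    # coordinate update for each cluster-mean factor q(mu_k) = N(m_k, s2_k)
    denom = 1.0 / prior_var + phi.sum(axis=0)
    m = (phi * x[:, None]).sum(axis=0) / denom
    s2 = 1.0 / denom

print("variational means:", np.round(m, 2))   # should sit near -3 and 3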
Communication-Efficient Distributed Statistical Inference
by Yang, Yun; Lee, Jason D.; Jordan, Michael I.
in Algorithms; Bayesian analysis; Bayesian theory
2019
We present a communication-efficient surrogate likelihood (CSL) framework for solving distributed statistical inference problems. CSL provides a communication-efficient surrogate to the global likelihood that can be used for low-dimensional estimation, high-dimensional regularized estimation, and Bayesian inference. For low-dimensional estimation, CSL provably improves upon naive averaging schemes and facilitates the construction of confidence intervals. For high-dimensional regularized estimation, CSL leads to a minimax-optimal estimator with controlled communication cost. For Bayesian inference, CSL can be used to form a communication-efficient quasi-posterior distribution that converges to the true posterior. This quasi-posterior procedure significantly improves the computational efficiency of Markov chain Monte Carlo (MCMC) algorithms even in a nondistributed setting. We present both theoretical analysis and experiments to explore the properties of the CSL approximation. Supplementary materials for this article are available online.
Journal Article
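A minimal sketch of the one-round surrogate idea this abstract describes, specialized to distributed least squares where the corrected local problem solves in closed form: workers ship only gradients at a pilot estimate, and the first machine re-solves its local problem with a linear correction. The toy data, shard layout, and names are assumptions rather than the paper's implementation.

import numpy as np

rng = np.random.default_rng(1)
theta_true = np.array([1.0, -2.0, 0.5])
shards = []                                      # (X_j, y_j) held on each machine
for _ in range(8):
    X = rng.normal(size=(500, 3))
    shards.append((X, X @ theta_true + rng.normal(size=500)))

def grad(X, y, th):                              # gradient of 0.5 * mean squared error
    return X.T @ (X @ th - y) / len(y)

X1, y1 = shards[0]
pilot = np.linalg.lstsq(X1, y1, rcond=None)[0]   # local pilot estimate on machine 1
g_global = np.mean([grad(X, y, pilot) for X, y in shards], axis=0)
g_local = grad(X1, y1, pilot)

# Surrogate loss: local loss plus the linear correction (g_global - g_local)' theta.
# For least squares its minimizer is available in closed form:
H1 = X1.T @ X1 / len(y1)
theta_csl = np.linalg.solve(H1, X1.T @ y1 / len(y1) - (g_global - g_local))
print(np.round(theta_csl, 3))                    # close to theta_true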
Bayesian regression tree ensembles that adapt to smoothness and sparsity
2018
Ensembles of decision trees are a useful tool for obtaining flexible estimates of regression functions. Examples of these methods include gradient-boosted decision trees, random forests and Bayesian classification and regression trees. Two potential shortcomings of tree ensembles are their lack of smoothness and their vulnerability to the curse of dimensionality. We show that these issues can be overcome by instead considering sparsity-inducing soft decision trees in which the decisions are treated as probabilistic. We implement this in the context of the Bayesian additive regression trees framework and illustrate its promising performance through testing on benchmark data sets. We provide strong theoretical support for our methodology by showing that the posterior distribution concentrates at the minimax rate (up to a logarithmic factor) for sparse functions and functions with additive structures in the high-dimensional regime where the dimensionality of the covariate space is allowed to grow nearly exponentially in the sample size. Our method also adapts to the unknown smoothness and sparsity levels, and can be implemented by making minimal modifications to existing Bayesian additive regression tree algorithms.
Journal Article
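To make the "soft" splitting rule concrete, here is a toy comparison of a hard CART-style split with the probabilistic routing this abstract describes; the logistic gating, bandwidth, and leaf values are illustrative assumptions.

import numpy as np

def soft_tree(x, c=0.0, b=0.1, left=1.0, right=3.0):
    p_right = 1.0 / (1.0 + np.exp(-(x - c) / b))   # probabilistic routing
    return (1 - p_right) * left + p_right * right

def hard_tree(x, c=0.0, left=1.0, right=3.0):
    return np.where(x <= c, left, right)           # usual hard decision rule

x = np.linspace(-1, 1, 9)
print(np.round(hard_tree(x), 2))   # jumps at the split point
print(np.round(soft_tree(x), 2))   # transitions smoothly across the split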
A general framework for updating belief distributions
2016
We propose a framework for general Bayesian inference. We argue that a valid update of a prior belief distribution to a posterior can be made for parameters which are connected to observations through a loss function rather than the traditional likelihood function, which is recovered as a special case. Modern application areas make it increasingly challenging for Bayesians to attempt to model the true data-generating mechanism. For instance, when the object of interest is low dimensional, such as a mean or median, it is cumbersome to have to achieve this via a complete model for the whole data distribution. More importantly, there are settings where the parameter of interest does not directly index a family of density functions and thus the Bayesian approach to learning about such parameters is currently regarded as problematic. Our framework uses loss functions to connect information in the data to functionals of interest. The updating of beliefs then follows from a decision theoretic approach involving cumulative loss functions. Importantly, the procedure coincides with Bayesian updating when a true likelihood is known yet provides coherent subjective inference in much more general settings. Connections to other inference frameworks are highlighted.
Journal Article
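A minimal grid-based sketch of the loss-driven update this abstract proposes: the belief distribution is proportional to the prior times exp(minus cumulative loss), here with absolute-error loss so the functional of interest is a median. The prior, grid, and loss scale w are assumptions.

import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_t(df=2, size=200) + 1.5      # heavy-tailed data, median near 1.5

theta = np.linspace(-5, 5, 2001)              # grid over the parameter
log_prior = -0.5 * theta**2                   # N(0, 1) prior, unnormalized
loss = np.abs(x[:, None] - theta).sum(axis=0) # cumulative absolute-error loss
w = 1.0                                       # assumed loss scale / learning rate

log_post = log_prior - w * loss
post = np.exp(log_post - log_post.max())
post /= post.sum() * (theta[1] - theta[0])    # normalize on the grid
print("posterior mean:", (theta * post).sum() * (theta[1] - theta[0]))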
Dirichlet–Laplace Priors for Optimal Shrinkage
by Pillai, Natesh S.; Dunson, David B.; Pati, Debdeep
in Bayesian; Bayesian analysis; Bayesian method
2015
Penalized regression methods, such as L₁ regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation. In contrast to the frequentist literature, little is known about the properties of such priors and the convergence and concentration of the corresponding posterior distribution. In this article, we propose a new class of Dirichlet–Laplace priors, which possess optimal posterior concentration and lead to efficient posterior computation. Finite sample performance of Dirichlet–Laplace priors relative to alternatives is assessed in simulated and real data examples.
Journal Article
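As an illustration of the global-local scale-mixture representation this abstract credits with easing computation, here is a sketch that draws a sparse coefficient vector from a Dirichlet–Laplace-type prior; the concentration parameter and dimension are assumptions, and the hyperparameter choices follow the standard mixture form rather than any particular application.

import numpy as np

rng = np.random.default_rng(3)
n, a = 1000, 0.5                              # dimension; Dirichlet concentration

phi = rng.dirichlet(np.full(n, a))            # local weights, summing to 1
tau = rng.gamma(shape=n * a, scale=2.0)       # global scale, Gamma(na, 1/2)
psi = rng.exponential(scale=2.0, size=n)      # Exp(1/2) mixing variances
theta = rng.normal(size=n) * np.sqrt(psi) * phi * tau

# Most coordinates are shrunk toward zero while a few stay large:
print("largest |theta|:", np.round(np.sort(np.abs(theta))[-5:], 2))
print("share below 0.01:", np.mean(np.abs(theta) < 0.01))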
Model misspecification in approximate Bayesian computation
by Rousseau, Judith; Frazier, David T.; Robert, Christian P.
in Approximate Bayesian computation; Asymptotic properties; Asymptotics
2020
We analyse the behaviour of approximate Bayesian computation (ABC) when the model generating the simulated data differs from the actual data-generating process, i.e. when the data simulator in ABC is misspecified. We demonstrate both theoretically and in simple, but practically relevant, examples that when the model is misspecified different versions of ABC can yield substantially different results. Our theoretical results demonstrate that even though the model is misspecified, under regularity conditions, the accept–reject ABC approach concentrates posterior mass on an appropriately defined pseudo-true parameter value. However, under model misspecification the ABC posterior does not yield credible sets with valid frequentist coverage and has non-standard asymptotic behaviour. In addition, we examine the theoretical behaviour of the popular local regression adjustment to ABC under model misspecification and demonstrate that this approach concentrates posterior mass on a pseudo-true value that is completely different from accept–reject ABC. Using our theoretical results, we suggest two approaches to diagnose model misspecification in ABC. All theoretical results and diagnostics are illustrated in a simple running example.
Journal Article
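A minimal accept–reject ABC sketch of the kind this abstract analyses: draw a parameter from the prior, simulate data from the model, and keep the draw if its summary statistic lands within a tolerance of the observed one. The toy Gaussian model, summary, and tolerance are assumptions.

import numpy as np

rng = np.random.default_rng(4)
y_obs = rng.normal(loc=2.0, scale=1.0, size=100)
s_obs = y_obs.mean()                       # observed summary statistic

eps, kept = 0.05, []
while len(kept) < 500:
    mu = rng.normal(0, 5)                  # draw from the prior
    y_sim = rng.normal(mu, 1.0, size=100)  # simulate data from the model
    if abs(y_sim.mean() - s_obs) < eps:    # accept if summaries are close
        kept.append(mu)

post = np.array(kept)
print("ABC posterior mean:", round(post.mean(), 2))   # near 2.0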
Frequentist Consistency of Variational Bayes
2019
A key challenge for modern Bayesian statistics is how to perform scalable inference of posterior distributions. To address this challenge, variational Bayes (VB) methods have emerged as a popular alternative to the classical Markov chain Monte Carlo (MCMC) methods. VB methods tend to be faster while achieving comparable predictive performance. However, there are few theoretical results around VB. In this article, we establish frequentist consistency and asymptotic normality of VB methods. Specifically, we connect VB methods to point estimates based on variational approximations, called frequentist variational approximations, and we use the connection to prove a variational Bernstein-von Mises theorem. The theorem leverages the theoretical characterizations of frequentist variational approximations to understand asymptotic properties of VB. In summary, we prove that (1) the VB posterior converges to the Kullback-Leibler (KL) minimizer of a normal distribution, centered at the truth and (2) the corresponding variational expectation of the parameter is consistent and asymptotically normal. As applications of the theorem, we derive asymptotic properties of VB posteriors in Bayesian mixture models, Bayesian generalized linear mixed models, and Bayesian stochastic block models. We conduct a simulation study to illustrate these theoretical results. Supplementary materials for this article are available online.
Journal Article
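A small simulation in the spirit of the consistency result this abstract proves: a standard conjugate mean-field VB scheme for a normal model with unknown mean and precision, whose variational posterior mean settles at the truth and whose spread shrinks roughly like 1/sqrt(n). Priors, update equations, and all settings are illustrative assumptions, not the authors' code.

import numpy as np

rng = np.random.default_rng(5)

def vb_normal(x, mu0=0.0, k0=1.0, a0=1.0, b0=1.0, iters=50):
    n, xbar = len(x), x.mean()
    m = (k0 * mu0 + n * xbar) / (k0 + n)   # mean of q(mu): a fixed point here
    E_lam = 1.0                            # initial guess for E[precision]
    for _ in range(iters):
        s2 = 1.0 / ((k0 + n) * E_lam)      # variance of q(mu)
        a = a0 + 0.5 * (n + 1)             # q(lambda) = Gamma(a, b) update
        b = b0 + 0.5 * (k0 * ((m - mu0) ** 2 + s2)
                        + ((x - m) ** 2).sum() + n * s2)
        E_lam = a / b
    return m, np.sqrt(s2)

for n in (50, 500, 5000):
    x = rng.normal(2.0, 1.0, size=n)
    m, s = vb_normal(x)
    print(f"n={n:5d}  VB mean={m:.3f}  VB sd={s:.4f}")  # sd shrinks ~ 1/sqrt(n)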