Catalogue Search | MBRL
22 result(s) for "Willard, Brandon"
Proximal Algorithms in Statistics and Machine Learning
by Polson, Nicholas G.; Willard, Brandon T.; Scott, James G.
in ADMM, Artificial intelligence, Bayes MAP
2015
Proximal algorithms are useful for obtaining solutions to difficult optimization problems, especially those involving nonsmooth or composite objective functions. A proximal algorithm is one whose basic iterations involve the proximal operator of some function, whose evaluation requires solving a specific optimization problem that is typically easier than the original problem. Many familiar algorithms can be cast in this form, and this "proximal view" turns out to provide a set of broad organizing principles for many algorithms useful in statistics and machine learning. In this paper, we show how a number of recent advances in this area can inform modern statistical practice. We focus on several main themes: (1) variable splitting strategies and the augmented Lagrangian; (2) the broad utility of envelope (or variational) representations of objective functions; (3) proximal algorithms for composite objective functions; and (4) the surprisingly large number of functions for which there are closed-form solutions of proximal operators. We illustrate our methodology with regularized Logistic and Poisson regression incorporating a nonconvex bridge penalty and a fused lasso penalty. We also discuss several related issues, including the convergence of nondescent algorithms, acceleration and optimization for nonconvex functions. Finally, we provide directions for future research in this exciting area at the intersection of statistics and optimization.
Journal Article
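As a concrete illustration of the kind of iteration the abstract above describes, here is a minimal sketch of a standard proximal gradient (ISTA) method for the lasso, built around the proximal operator of the ℓ₁ norm (soft thresholding). This is generic textbook code under placeholder data and step-size choices, not code from the paper.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t * ||.||_1: elementwise soft thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_lasso(X, y, lam, n_iter=200):
    """Proximal gradient (ISTA) for 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L, with L the gradient's Lipschitz constant
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)             # gradient of the smooth least-squares part
        b = prox_l1(b - step * grad, step * lam)
    return b

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
beta_true = np.zeros(10)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.normal(size=50)
print(ista_lasso(X, y, lam=1.0))
```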
Lasso Meets Horseshoe
by Bhadra, Anindya; Polson, Nicholas G.; Willard, Brandon
in Bayesian analysis, Computational geometry, Convex analysis
2019
The goal of this paper is to contrast and survey the major advances in two of the most commonly used high-dimensional techniques, namely, the Lasso and horseshoe regularization. Lasso is a gold standard for predictor selection while horseshoe is a state-of-the-art Bayesian estimator for sparse signals. Lasso is fast and scalable and uses convex optimization whilst the horseshoe is nonconvex. Our novel perspective focuses on three aspects: (i) theoretical optimality in high-dimensional inference for the Gaussian sparse model and beyond, (ii) efficiency and scalability of computation and (iii) methodological development and performance.
Journal Article
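For readers comparing the two approaches surveyed in the record above, the sketch below writes the lasso as a convex penalized optimization problem and the horseshoe as a hierarchical (global-local) prior whose posterior mean serves as the estimator. This is standard notation, not taken from the paper; the prior on the global scale τ varies across formulations, with a half-Cauchy being a common default.

```latex
% Lasso: convex optimization (MAP under independent Laplace priors)
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}\; \tfrac{1}{2}\lVert y - X\beta\rVert_2^2 + \lambda \lVert\beta\rVert_1 .

% Horseshoe: hierarchical global-local prior; inference via the posterior mean
\beta_j \mid \lambda_j, \tau \sim \mathcal{N}\!\bigl(0, \lambda_j^2 \tau^2\bigr), \qquad
\lambda_j \sim \mathrm{C}^{+}(0,1), \qquad
\tau \sim \mathrm{C}^{+}(0,1).
```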
Default Bayesian analysis with global-local shrinkage priors
2016
We provide a framework for assessing the default nature of a prior distribution using the property of regular variation, which we study for global-local shrinkage priors. In particular, we show that the horseshoe priors, originally designed to handle sparsity, are regularly varying and thus are appropriate for default Bayesian analysis. To illustrate our methodology, we discuss four problems of noninformative priors that have been shown to be highly informative for nonlinear functions. In each case, we show that global-local horseshoe priors perform as required. Global-local shrinkage priors can separate a low-dimensional signal from high-dimensional noise even for nonlinear functions.
Journal Article
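The key property named in the abstract above, regular variation, has a standard definition that may help readers place the result; the hierarchy of the global-local horseshoe prior itself is sketched after the previous record. This is the textbook definition, not a statement from the paper.

```latex
% f is regularly varying (at infinity) with index \rho if, for every x > 0,
\lim_{t \to \infty} \frac{f(tx)}{f(t)} = x^{\rho}.
```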
The Horseshoe-Like Regularization for Feature Subset Selection
by Bhadra, Anindya; Polson, Nicholas G.; Willard, Brandon T.
in Mathematics and Statistics, Statistics
2021
Feature subset selection arises in many high-dimensional applications of statistics, such as compressed sensing and genomics. The ℓ₀ penalty is ideal for this task, the caveat being it requires the NP-hard combinatorial evaluation of all models. A recent area of considerable interest is to develop efficient algorithms to fit models with a non-convex ℓγ penalty for γ ∈ (0, 1), which results in sparser models than the convex ℓ₁ or lasso penalty, but is harder to fit. We propose an alternative, termed the horseshoe regularization penalty for feature subset selection, and demonstrate its theoretical and computational advantages. The distinguishing feature from existing non-convex optimization approaches is a full probabilistic representation of the penalty as the negative of the logarithm of a suitable prior, which in turn enables efficient expectation-maximization and local linear approximation algorithms for optimization and MCMC for uncertainty quantification. In synthetic and real data, the resulting algorithms provide better statistical performance, and the computation requires a fraction of time of state-of-the-art non-convex solvers.
Journal Article
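The construction described in the abstract above, a penalty defined as the negative logarithm of a prior, fits the generic penalized-regression template below. The local linear approximation (LLA) step shown is the standard one (Zou and Li, 2008) rather than the paper's exact algorithm, and the specific form of pen(·) is deliberately left abstract.

```latex
% Penalized objective with pen(beta_j) = -log p(beta_j) for a chosen prior density p
\min_{\beta}\; \tfrac{1}{2}\lVert y - X\beta\rVert_2^2 + \sum_{j} \operatorname{pen}\!\bigl(\lvert\beta_j\rvert\bigr).

% One local linear approximation (LLA) step: a weighted lasso subproblem
\beta^{(t+1)} = \arg\min_{\beta}\; \tfrac{1}{2}\lVert y - X\beta\rVert_2^2
  + \sum_{j} \operatorname{pen}'\!\bigl(\lvert\beta_j^{(t)}\rvert\bigr)\,\lvert\beta_j\rvert .
```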
Global-Local Mixtures
by Bhadra, Anindya; Polson, Nicholas G.; Willard, Brandon T.
in Mathematics and Statistics, Statistical Theory and Methods, Statistics
2020
Global-local mixtures, including Gaussian scale mixtures, have gained prominence in recent times, both as a sparsity inducing prior in p ≫ n problems as well as default priors for non-linear many-to-one functionals of high-dimensional parameters. Here we propose a unifying framework for global-local scale mixtures using the Cauchy-Schlömilch and Liouville integral transformation identities, and use the framework to build a new Bayesian sparse signal recovery method. This new method is a Bayesian counterpart of the √Lasso (Belloni et al., Biometrika 98, 4, 791–806, 2011) that adapts to unknown error variance. Our framework also characterizes well-known scale mixture distributions including the Laplace density used in Bayesian Lasso, logit and quantile via a single integral identity. Finally, we derive a few convolutions that commonly arise in Bayesian inference and posit a conjecture concerning bridge and uniform correlation mixtures.
Journal Article
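One of the scale-mixture identities alluded to in the abstract above, the Laplace density behind the Bayesian lasso, can be written explicitly. This is the standard Andrews-Mallows / Park-Casella representation, quoted here for orientation rather than from the paper.

```latex
% Laplace density as a Gaussian scale mixture with an exponential mixing density
\frac{\lambda}{2}\, e^{-\lambda\lvert\beta\rvert}
  = \int_{0}^{\infty} \mathcal{N}\!\bigl(\beta \mid 0, s\bigr)\,
    \frac{\lambda^{2}}{2}\, e^{-\lambda^{2} s / 2}\, \mathrm{d}s .
```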
Efficient Guided Generation for Large Language Models
2023
In this article we show how the problem of neural text generation can be constructively reformulated in terms of transitions between the states of a finite-state machine. This framework leads to an efficient approach to guiding text generation with regular expressions and context-free grammars by allowing the construction of an index over a language model's vocabulary. The approach is model agnostic, allows one to enforce domain-specific knowledge and constraints, and enables the construction of reliable interfaces by guaranteeing the structure of the generated text. It adds little overhead to the token sequence generation process and significantly outperforms existing solutions. An implementation is provided in the open-source Python library Outlines.
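A minimal sketch of the indexing idea described above: a small hand-written finite-state machine for the pattern [0-9]+ is used to precompute, for each FSM state, which tokens in a toy vocabulary can be emitted without leaving the language. Everything here (the FSM, the vocabulary, the function names) is illustrative and does not use the Outlines API.

```python
# Toy illustration of FSM-guided generation: precompute, for each FSM state,
# the set of vocabulary tokens whose characters the FSM can consume.

# Hand-written DFA for the regex [0-9]+ : state 0 = start, state 1 = accepting.
TRANSITIONS = {
    (0, "digit"): 1,
    (1, "digit"): 1,
}

def step(state, char):
    """Advance the DFA by one character; return None if the move is illegal."""
    label = "digit" if char.isdigit() else "other"
    return TRANSITIONS.get((state, label))

def consume(state, token):
    """Run a whole token through the DFA from `state`; None if it ever fails."""
    for ch in token:
        state = step(state, ch)
        if state is None:
            return None
    return state

VOCAB = ["1", "23", "4a", "abc", "007", "9"]

# The index: FSM state -> tokens that keep the partial output inside the language.
index = {
    s: [tok for tok in VOCAB if consume(s, tok) is not None]
    for s in (0, 1)
}
print(index)  # {0: ['1', '23', '007', '9'], 1: ['1', '23', '007', '9']}
# At generation time, the logits of every token *not* in index[current_state]
# would be masked out before sampling.
```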
miniKanren as a Tool for Symbolic Computation in Python
2020
In this article, we give a brief overview of the current state and future potential of symbolic computation within the Python statistical modeling and machine learning community. We detail the use of miniKanren as an underlying framework for term rewriting and symbolic mathematics, as well as its ability to orchestrate the use of existing Python libraries. We also discuss the relevance and potential of relational programming for implementing more robust, portable, domain-specific "math-level" optimizations--with a slight focus on Bayesian modeling. Finally, we describe the work going forward and raise some questions regarding potential cross-overs between statistical modeling and programming language theory.
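A small example of the style of relational programming referred to above, assuming the kanren package (a Python miniKanren implementation); the parent relation and its facts are invented purely for illustration.

```python
# Minimal relational-programming example with the kanren package
# (pip install kanren); the parent facts are purely illustrative.
from kanren import Relation, conde, facts, run, var

parent = Relation()
facts(parent, ("Abe", "Homer"), ("Homer", "Bart"), ("Homer", "Lisa"))

def grandparent(gp, gc):
    """gp is a grandparent of gc if gp is a parent of some p who is a parent of gc."""
    p = var()
    return conde((parent(gp, p), parent(p, gc)))

x = var()
print(run(0, x, parent(x, "Bart")))       # ('Homer',)
print(run(0, x, grandparent(x, "Bart")))  # ('Abe',)
```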
Real-time On and Off Road GPS Tracking
2014
This document describes a GPS-based tracking model for position and velocity states on and off of a road network, and it enables parallel, online learning of state-dependent parameters, such as GPS error, acceleration error, and road transition probabilities. More specifically, the conditionally linear tracking model of Ulmke and Koch (2006) is adapted to the Particle Learning framework of H. F. Lopes et al. (2011), which provides a foundation for further hierarchical Bayesian extensions. The filter is shown to perform well on a real city road network while sufficiently estimating on and off road transition probabilities. The model in this paper is also backed by an open-source Java project.
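The abstract above describes a particle-filter-based tracker. The sketch below is a generic bootstrap particle filter for a 1-D constant-velocity model with noisy position observations, meant only to illustrate the predict/weight/resample cycle; it is not the conditionally linear or Particle Learning scheme used in the paper, and all parameters are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_pf(obs, n_particles=500, dt=1.0, q=0.5, r=2.0):
    """Generic bootstrap particle filter for a 1-D constant-velocity model.

    State: (position, velocity); observation: noisy position.
    q = process-noise std on velocity, r = observation-noise std.
    """
    particles = rng.normal(0.0, 1.0, size=(n_particles, 2))  # columns: [pos, vel]
    estimates = []
    for z in obs:
        # Predict: constant-velocity motion with a random velocity perturbation
        particles[:, 0] += dt * particles[:, 1]
        particles[:, 1] += rng.normal(0.0, q, size=n_particles)
        # Weight: Gaussian likelihood of the observed position
        w = np.exp(-0.5 * ((z - particles[:, 0]) / r) ** 2)
        w /= w.sum()
        # Estimate (weighted mean position), then multinomial resampling
        estimates.append(w @ particles[:, 0])
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(estimates)

# Toy usage: a target moving at constant speed, observed with noise
true_pos = np.cumsum(np.full(30, 1.5))
obs = true_pos + rng.normal(0.0, 2.0, size=30)
print(bootstrap_pf(obs)[:5])
```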
Lasso Meets Horseshoe: A Survey
by Bhadra, Anindya; Polson, Nicholas G; Willard, Brandon T
in Bayesian analysis, Computational geometry, Computing time
2019
The goal of this paper is to contrast and survey the major advances in two of the most commonly used high-dimensional techniques, namely, the Lasso and horseshoe regularization. Lasso is a gold standard for predictor selection while horseshoe is a state-of-the-art Bayesian estimator for sparse signals. Lasso is fast and scalable and uses convex optimization whilst the horseshoe is non-convex. Our novel perspective focuses on three aspects: (i) theoretical optimality in high dimensional inference for the Gaussian sparse model and beyond, (ii) efficiency and scalability of computation and (iii) methodological development and performance.
Horseshoe Regularization for Feature Subset Selection
by Bhadra, Anindya; Willard, Brandon; Polson, Nicholas G
in Algorithms, Combinatorial analysis, Computational geometry
2017
Feature subset selection arises in many high-dimensional applications of statistics, such as compressed sensing and genomics. The ℓ₀ penalty is ideal for this task, the caveat being it requires the NP-hard combinatorial evaluation of all models. A recent area of considerable interest is to develop efficient algorithms to fit models with a non-convex ℓγ penalty for γ ∈ (0, 1), which results in sparser models than the convex ℓ₁ or lasso penalty, but is harder to fit. We propose an alternative, termed the horseshoe regularization penalty for feature subset selection, and demonstrate its theoretical and computational advantages. The distinguishing feature from existing non-convex optimization approaches is a full probabilistic representation of the penalty as the negative of the logarithm of a suitable prior, which in turn enables efficient expectation-maximization and local linear approximation algorithms for optimization and MCMC for uncertainty quantification. In synthetic and real data, the resulting algorithms provide better statistical performance, and the computation requires a fraction of time of state-of-the-art non-convex solvers.