Catalogue Search | MBRL

NONPARAMETRIC STATISTICAL INFERENCE FOR DRIFT VECTOR FIELDS OF MULTI-DIMENSIONAL DIFFUSIONS

by Nickl, Richard , Ray, Kolyan in Brownian motion , Covariance , Differential equations

2020

The problem of determining a periodic Lipschitz vector field b = (b_1, ..., b_d) from an observed trajectory of the solution (X_t : 0 ≤ t ≤ T) of the multi-dimensional stochastic differential equation dX_t = b(X_t) dt + dW_t, t ≥ 0, where W_t is a standard d-dimensional Brownian motion, is considered. Convergence rates of a penalised least squares estimator, which equals the maximum a posteriori (MAP) estimate corresponding to a high-dimensional Gaussian product prior, are derived. These results are deduced from corresponding contraction rates for the associated posterior distributions. The rates obtained are optimal up to log-factors in L²-loss in any dimension, and also for supremum norm loss when d ≤ 4. Further, when d ≤ 3, nonparametric Bernstein–von Mises theorems are proved for the posterior distributions of b. From this, we deduce functional central limit theorems for the implied estimators of the invariant measure μ_b. The limiting Gaussian process distributions have a covariance structure that is asymptotically optimal from an information-theoretic point of view.

Journal Article


Semiparametric mixed-scale models using shared Bayesian forests

by Sinha, Debajyoti , Linero, Antonio R. , Lipsitz, Stuart R. in Bayesian additive regression trees , Bayesian analysis , Bayesian theory

2020

This paper demonstrates the advantages of sharing information about unknown features of covariates across multiple model components in various nonparametric regression problems including multivariate, heteroscedastic, and semicontinuous responses. In this paper, we present a methodology which allows for information to be shared nonparametrically across various model components using Bayesian sum-of-tree models. Our simulation results demonstrate that sharing of information across related model components is often very beneficial, particularly in sparse high-dimensional problems in which variable selection must be conducted. We illustrate our methodology by analyzing medical expenditure data from the Medical Expenditure Panel Survey (MEPS). To facilitate the Bayesian nonparametric regression analysis, we develop two novel models for analyzing the MEPS data using Bayesian additive regression trees: a heteroskedastic log-normal hurdle model with a “shrink-toward-homoskedasticity” prior and a gamma hurdle model.

Journal Article


On Recursive Bayesian Predictive Distributions

by Walker, Stephen G. , Martin, Ryan , Hahn, P. Richard in Algorithms , Bayesian analysis , Bayesian theory

2018

A Bayesian framework is attractive in the context of prediction, but a fast recursive update of the predictive distribution has apparently been out of reach, in part because Monte Carlo methods are generally used to compute the predictive. This article shows that online Bayesian prediction is possible by characterizing the Bayesian predictive update in terms of a bivariate copula, making it unnecessary to pass through the posterior to update the predictive. In standard models, the Bayesian predictive update corresponds to familiar choices of copula but, in nonparametric problems, the appropriate copula may not have a closed-form expression. In such cases, our new perspective suggests a fast recursive approximation to the predictive density, in the spirit of Newton's predictive recursion algorithm, but without requiring evaluation of normalizing constants. Consistency of the new algorithm is shown, and numerical examples demonstrate its quality performance in finite-samples compared to fully Bayesian and kernel methods. Supplementary materials for this article are available online.

Journal Article


Bayesian Approaches for Missing Not at Random Outcome Data

by Linero, Antonio R. , Daniels, Michael J. in Bayesian analysis , Constrictions , Data analysis

2018

Missing data is almost always present in real datasets, and introduces several statistical issues. One fundamental issue is that, in the absence of strong uncheckable assumptions, effects of interest are typically not nonparametrically identified. In this article, we review the generic approach of the use of identifying restrictions from a likelihood-based perspective, and provide points of contact for several recently proposed methods. An emphasis of this review is on restrictions for nonmonotone missingness, a subject that has been treated sparingly in the literature. We also present a general, fully Bayesian, approach which is widely applicable and capable of handling a variety of identifying restrictions in a uniform manner.

Journal Article


BAYESIAN MANIFOLD REGRESSION

by Yang, Yun , Dunson, David B. in 62-07 , 62H30 , 65U05

2016

There is increasing interest in the problem of nonparametric regression with high-dimensional predictors. When the number of predictors D is large, one encounters a daunting problem in attempting to estimate a D-dimensional surface based on limited data. Fortunately, in many applications, the support of the data is concentrated on a d-dimensional subspace with d ≪ D. Manifold learning attempts to estimate this subspace. Our focus is on developing computationally tractable and theoretically supported Bayesian nonparametric regression methods in this context. When the subspace corresponds to a locally-Euclidean compact Riemannian manifold, we show that a Gaussian process regression approach can be applied that leads to the minimax optimal adaptive rate in estimating the regression function under some conditions. The proposed model bypasses the need to estimate the manifold, and can be implemented using standard algorithms for posterior computation in Gaussian processes. Finite sample performance is illustrated in a data analysis example.

Journal Article


Adaptive Bayesian Procedures Using Random Series Priors

by Ghosal, Subhashis , Shen, Weining in adaptation , B-spline , Bayesian analysis

2015

We consider a general class of prior distributions for nonparametric Bayesian estimation which uses finite random series with a random number of terms. A prior is constructed through distributions on the number of basis functions and the associated coefficients. We derive a general result on adaptive posterior contraction rates for all smoothness levels of the target function in the true model by constructing an appropriate 'sieve' and applying the general theory of posterior contraction rates. We apply this general result on several statistical problems such as density estimation, various nonparametric regressions, classification, spectral density estimation and functional regression. The prior can be viewed as an alternative to the commonly used Gaussian process prior, but properties of the posterior distribution can be analysed by relatively simpler techniques. An interesting approximation property of B-spline basis expansion established in this paper allows a canonical choice of prior on coefficients in a random series and allows a simple computational approach without using Markov chain Monte Carlo methods. A simulation study is conducted to show that the accuracy of the Bayesian estimators based on the random series prior and the Gaussian process prior are comparable. We apply the method on Tecator data using functional regression models.

Journal Article


Multiple Imputation of Missing Categorical and Continuous Values via Bayesian Mixture Models With Local Dependence

by Reiter, Jerome P. , Murray, Jared S. in Applications and Case Studies , Bayesian analysis , Bayesian method

2016

We present a nonparametric Bayesian joint model for multivariate continuous and categorical variables, with the intention of developing a flexible engine for multiple imputation of missing values. The model fuses Dirichlet process mixtures of multinomial distributions for categorical variables with Dirichlet process mixtures of multivariate normal distributions for continuous variables. We incorporate dependence between the continuous and categorical variables by (1) modeling the means of the normal distributions as component-specific functions of the categorical variables and (2) forming distinct mixture components for the categorical and continuous data with probabilities that are linked via a hierarchical model. This structure allows the model to capture complex dependencies between the categorical and continuous data with minimal tuning by the analyst. We apply the model to impute missing values due to item nonresponse in an evaluation of the redesign of the Survey of Income and Program Participation (SIPP). The goal is to compare estimates from a field test with the new design to estimates from selected individuals from a panel collected under the old design. We show that accounting for the missing data changes some conclusions about the comparability of the distributions in the two datasets. We also perform an extensive repeated sampling simulation using similar data from complete cases in an existing SIPP panel, comparing our proposed model to a default application of multiple imputation by chained equations. Imputations based on the proposed model tend to have better repeated sampling properties than the default application of chained equations in this realistic setting. Supplementary materials for this article are available online.

Journal Article


NONPARAMETRIC BAYESIAN TWO-LEVEL CLUSTERING FOR SUBJECT-LEVEL SINGLE-CELL EXPRESSION DATA

by Luo, Xiangyu , Wu, Qiuyu

2022

The advent of single-cell sequencing opens new avenues for personalized treatment. In this study, we address a two-level clustering problem of simultaneous subject subgroup discovery (subject level) and cell type detection (cell level) for single-cell expression data from multiple subjects. Current statistical approaches either cluster cells without considering the subject heterogeneity, or group subjects without using the single-cell information. To bridge the gap between cell clustering and subject grouping, we develop a nonparametric Bayesian model, Subject and Cell clustering for Single-Cell expression data (SCSC) model, to achieve subject and cell grouping simultaneously. The SCSC model does not need to prespecify the subject subgroup number or the cell type number. It automatically induces subject subgroup structures and matches cell types across subjects. Moreover, it directly models the single-cell raw count data by deliberately considering the data's dropouts, library sizes, and over-dispersion. A blocked Gibbs sampler is proposed for the posterior inference. Simulation studies and an application to a multi-subject induced pluripotent stem cell single-cell RNA sequencing data set validate the ability of the SCSC model to simultaneously cluster subjects and cells.

Journal Article


BAYESIAN NONPARAMETRIC MULTIVARIATE SPATIAL MIXTURE MIXED EFFECTS MODELS WITH APPLICATION TO AMERICAN COMMUNITY SURVEY SPECIAL TABULATIONS

by Holan, Scott H. , Janicki, Ryan , Raim, Andrew M.

2022

Leveraging multivariate spatial dependence to improve the precision of estimates using American Community Survey data and other sample survey data has been a topic of recent interest among data users and federal statistical agencies. One strategy is to use a multivariate spatial mixed effects model with a Gaussian observation model and latent Gaussian process model. In practice, this works well for a wide range of tabulations. Nevertheless, in situations in which the data exhibit heterogeneity within or across geographies, and/or there is sparsity in the data, the Gaussian assumptions may be problematic and lead to underperformance. To remedy these situations, we propose a multivariate hierarchical Bayesian nonparametric mixed effects spatial mixture model to increase model flexibility. The number of clusters is chosen automatically in a data-driven manner. The effectiveness of our approach is demonstrated through a simulation study and motivating application of special tabulations for American Community Survey data.

Journal Article


Nonparametric Bayes dynamic modelling of relational data

by DURANTE, DANIELE , DUNSON, DAVID B. in Bayesian analysis , Coordinate systems , Covariance matrices

2014

Symmetric binary matrices representing relations are collected in many areas. Our focus is on dynamically evolving binary relational matrices, with interest being on inference on the relationship structure and prediction. We propose a nonparametric Bayesian dynamic model, which reduces dimensionality in characterizing the binary matrix through a lower-dimensional latent space representation, with the latent coordinates evolving in continuous time via Gaussian processes. By using a logistic mapping function from the link probability matrix space to the latent relational space, we obtain a flexible and computationally tractable formulation. Employing Pólya-gamma data augmentation, an efficient Gibbs sampler is developed for posterior computation, with the dimension of the latent space automatically inferred. We provide theoretical results on flexibility of the model, and illustrate its performance via simulation experiments. We also consider an application to co-movements in world financial markets.

Journal Article
