Catalogue Search | MBRL

Sufficient Dimension Reduction via Random-Partitions for the Large-p-Small-n Problem

by Hung, Hung , Huang, Su-Yun in Alcoholism - pathology , algebra , Algorithms

2019

Sufficient dimension reduction (SDR) continues to be an active field of research. When estimating the central subspace (CS), inverse regression based SDR methods involve solving a generalized eigenvalue problem, which can be problematic under the large-p-small-n situation. In recent years, new techniques have emerged in numerical linear algebra, called randomized algorithms or random sketching, for high-dimensional and large scale problems. To overcome the large-p-small-n SDR problem, we combine the idea of statistical inference with random sketching to propose a new SDR method, called integrated random-partition SDR (iRP-SDR). Our method consists of the following three steps: (i) Randomly partition the covariates into subsets to construct an envelope subspace with low dimension. (ii) Obtain a sketch of the CS by applying a conventional SDR method within the constructed envelope subspace. (iii) Repeat the above two steps many times and integrate these multiple sketches to form the final estimate of the CS. After describing the details of these steps, the asymptotic properties of iRP-SDR are established. Unlike existing methods, iRP-SDR does not involve the determination of the structural dimension until the last stage, which makes it more adaptive to a high-dimensional setting. The advantageous performance of iRP-SDR is demonstrated via simulation studies and a practical example analyzing EEG data.

Journal Article

CONDITIONAL FORMULAE FOR GIBBS-TYPE EXCHANGEABLE RANDOM PARTITIONS

by Prünster, Igor , Favaro, Stefano , Lijoi, Antonio in 60G57 , 62F15 , 62G05

2013

Gibbs-type random probability measures and the exchangeable random partitions they induce represent an important framework both from a theoretical and applied point of view. In the present paper, motivated by species sampling problems, we investigate some properties concerning the conditional distribution of the number of blocks with a certain frequency generated by Gibbs-type random partitions. The general results are then specialized to three noteworthy examples yielding completely explicit expressions of their distributions, moments and asymptotic behaviors. Such expressions can be interpreted as Bayesian nonparametric estimators of the rare species variety and their performance is tested on some real genomic data.

Journal Article

Looking-backward probabilities for Gibbs-type exchangeable random partitions

by FAVARO, STEFANO , BACALLADO, SERGIO , TRIPPA, LORENZO in Bayesian nonparametrics , conditional random partitions , Ewens–Pitman sampling model

2015

Gibbs-type random probability measures and the exchangeable random partitions they induce represent the subject of a rich and active literature. They provide a probabilistic framework for a wide range of theoretical and applied problems that are typically referred to as species sampling problems. In this paper, we consider the class of looking-backward species sampling problems introduced in Lijoi et al. (Ann. Appl. Probab. 18 (2008) 1519-1547) in Bayesian nonparametrics. Specifically, given some information on the random partition induced by an initial sample from a Gibbs-type random probability measure, we study the conditional distributions of statistics related to the old species, namely those species detected in the initial sample and possibly re-observed in an additional sample. The proposed results contribute to the analysis of conditional properties of Gibbs-type exchangeable random partitions, so far focused mainly on statistics related to those species generated by the additional sample and not already detected in the initial sample.

Journal Article

The Ubiquitous Ewens Sampling Formula

by Crane, Harry in Algebra , Alleles , alpha-permanent

2016

Ewens's sampling formula exemplifies the harmony of mathematical theory, statistical application, and scientific discovery. The formula not only contributes to the foundations of evolutionary molecular genetics, the neutral theory of biodiversity, Bayesian nonparametrics, combinatorial stochastic processes, and inductive inference but also emerges from fundamental concepts in probability theory, algebra, and number theory. With an emphasis on its far-reaching influence throughout statistics and probability, we highlight these and many other consequences of Ewens's seminal discovery.
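For readers who want the formula itself: in its standard form, Ewens's sampling formula gives a sample of size n with a_j blocks of size j the probability n! / (θ(θ+1)...(θ+n-1)) times the product over j of θ^(a_j) / (j^(a_j) a_j!). The following is a minimal sketch in exact rational arithmetic; the function names are illustrative and are not taken from the article.

```python
from fractions import Fraction
from math import factorial


def rising_factorial(theta, n):
    """Return theta * (theta + 1) * ... * (theta + n - 1)."""
    result = Fraction(1)
    for i in range(n):
        result *= theta + i
    return result


def ewens_probability(counts, theta):
    """Probability of an allelic partition under Ewens's sampling formula.

    counts[j - 1] is the number of blocks of size j, so the sample size is
    n = sum(j * counts[j - 1]).
    """
    theta = Fraction(theta)
    n = sum(j * a for j, a in enumerate(counts, start=1))
    prob = Fraction(factorial(n)) / rising_factorial(theta, n)
    for j, a in enumerate(counts, start=1):
        prob *= theta ** a / (Fraction(j) ** a * factorial(a))
    return prob


def allelic_partitions(n):
    """Yield every count vector (a_1, ..., a_n) with sum(j * a_j) == n."""
    def build(remaining, j):
        if j == 0:
            if remaining == 0:
                yield []
            return
        for a in range(remaining // j + 1):
            for rest in build(remaining - j * a, j - 1):
                yield rest + [a]
    yield from build(n, n)
```

Summing `ewens_probability` over `allelic_partitions(n)` returns exactly 1 for any positive θ, which is a quick sanity check that the formula defines a probability distribution on partitions of n.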

Journal Article

On the Pitman–Yor process with spike and slab base measure

by CANALE, A. , NIPOTI, B. , LIJOI, A.

2017

For the most popular discrete nonparametric models, beyond the Dirichlet process, the prior guess at the shape of the data-generating distribution, also known as the base measure, is assumed to be diffuse. Such a specification greatly simplifies the derivation of analytical results, allowing for a straightforward implementation of Bayesian nonparametric inferential procedures. However, in several applied problems the available prior information leads naturally to the incorporation of an atom into the base measure, and then the Dirichlet process is essentially the only tractable choice for the prior. In this paper we fill this gap by considering the Pitman–Yor process with an atom in its base measure. We derive computable expressions for the distribution of the induced random partitions and for the predictive distributions. These findings allow us to devise an effective generalized Pólya urn Gibbs sampler. Applications to density estimation, clustering and curve estimation, with both simulated and real data, serve as an illustration of our results and allow comparisons with existing methodology. In particular, we tackle a functional data analysis problem concerning basal body temperature curves.

Journal Article

Bayesian inference with dependent normalized completely random measures

by NIPOTI, BERNARDO , PRÜNSTER, IGOR , LIJOI, ANTONIO in Bayesian inference , completely random measure , Datasets

2014

The proposal and study of dependent prior processes has been a major research focus in the recent Bayesian nonparametric literature. In this paper, we introduce a flexible class of dependent nonparametric priors, investigate their properties and derive a suitable sampling scheme which allows their concrete implementation. The proposed class is obtained by normalizing dependent completely random measures, where the dependence arises by virtue of a suitable construction of the Poisson random measures underlying the completely random measures. We first provide general distributional results for the whole class of dependent completely random measures and then we specialize them to two specific priors, which represent the natural candidates for concrete implementation due to their analytic tractability: the bivariate Dirichlet and normalized σ-stable processes. Our analytical results, and in particular the partially exchangeable partition probability function, form also the basis for the determination of a Markov Chain Monte Carlo algorithm for drawing posterior inferences, which reduces to the well-known Blackwell-MacQueen Pólya urn scheme in the univariate case. Such an algorithm can be used for density estimation and for analyzing the clustering structure of the data and is illustrated through a real two-sample dataset example.

Journal Article

A representation of exchangeable hierarchies by sampling from random real trees

by Haulk, Chris , Pitman, Jim , Forman, Noah in Hierarchies , Independent variables , Integers

2018

A hierarchy on a set $S$, also called a total partition of $S$, is a collection $\mathcal{H}$ of subsets of $S$ such that $S \in \mathcal{H}$, each singleton subset of $S$ belongs to $\mathcal{H}$, and if $A, B \in \mathcal{H}$ then $A \cap B$ equals either $A$ or $B$ or $\emptyset$. Every exchangeable random hierarchy of positive integers has the same distribution as a random hierarchy $\mathcal{H}$ associated as follows with a random real tree $\mathcal{T}$ equipped with root element 0 and a random probability distribution $p$ on the Borel subsets of $\mathcal{T}$: given $(\mathcal{T}, p)$, let $t_1, t_2, \ldots$ be independent and identically distributed according to $p$, and let $\mathcal{H}$ comprise all singleton subsets of $\mathbb{N}$, and every subset of the form $\{j : t_j \in F(x)\}$ as $x$ ranges over $\mathcal{T}$, where $F(x)$ is the fringe subtree of $\mathcal{T}$ rooted at $x$. There is also the alternative characterization: every exchangeable random hierarchy of positive integers has the same distribution as a random hierarchy $\mathcal{H}$ derived as follows from a random hierarchy $\mathscr{H}$ on $[0, 1]$ and a family $(U_j)$ of i.i.d. Uniform $[0, 1]$ random variables independent of $\mathscr{H}$: let $\mathcal{H}$ comprise all sets of the form $\{j : U_j \in B\}$ as $B$ ranges over the members of $\mathscr{H}$.
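The definition and the second characterization can be tried on a finite example. The sketch below is only an illustration under simplifying assumptions: the hierarchy on [0, 1] is replaced by a finite list of half-open intervals that are pairwise nested or disjoint, and the names `is_hierarchy` and `sample_hierarchy` are hypothetical, not taken from the article.

```python
import random
from itertools import combinations


def is_hierarchy(collection, ground_set):
    """Check the three defining properties of a hierarchy on ground_set."""
    ground = frozenset(ground_set)
    sets = {frozenset(s) for s in collection}
    # The whole set must be a member.
    if ground not in sets:
        return False
    # Every singleton must be a member.
    if any(frozenset([x]) not in sets for x in ground):
        return False
    # Any two members must be nested or disjoint.
    for a, b in combinations(sets, 2):
        common = a & b
        if common and common != a and common != b:
            return False
    return True


def sample_hierarchy(intervals, n, rng):
    """Hierarchy on {1, ..., n} induced by i.i.d. uniforms U_1, ..., U_n.

    intervals: assumed finite stand-in for a hierarchy on [0, 1], given as
    half-open pairs (lo, hi) that are pairwise nested or disjoint.
    """
    u = {j: rng.random() for j in range(1, n + 1)}
    blocks = {frozenset(j for j in u if lo <= u[j] < hi) for lo, hi in intervals}
    blocks |= {frozenset([j]) for j in u}  # singletons
    blocks.add(frozenset(u))               # the whole index set
    blocks.discard(frozenset())            # drop the empty block for simplicity
    return blocks
```

For example, `sample_hierarchy([(0, 0.5), (0.5, 1), (0, 0.25)], 8, random.Random(0))` yields a collection that passes `is_hierarchy` on {1, ..., 8}, because nested or disjoint intervals induce nested or disjoint index sets.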

Journal Article

A Space-Time Skew-t Model for Threshold Exceedances

by Morris, Samuel A. , Thibaud, Emeric , Reich, Brian J. in Air Pollutants , Air quality , Asymptotic methods

2017

To assess the compliance of air quality regulations, the Environmental Protection Agency (EPA) must know if a site exceeds a pre-specified level. In the case of ozone, the level for compliance is fixed at 75 parts per billion, which is high, but not extreme at all locations. We present a new space-time model for threshold exceedances based on the skew-t process. Our method incorporates a random partition to permit long-distance asymptotic independence while allowing for sites that are near one another to be asymptotically dependent, and we incorporate thresholding to allow the tails of the data to speak for themselves. We also introduce a transformed AR(1) time-series to allow for temporal dependence. Finally, our model allows for high-dimensional Bayesian inference that is comparable in computation time to traditional geostatistical methods for large data sets. We apply our method to an ozone analysis for July 2005, and find that our model improves over both Gaussian and max-stable methods in terms of predicting exceedances of a high level.

Journal Article

On the average time complexity of computation with random partition

by Liao, Mingxue , Lv, Pin in Algorithms , Combinatorial analysis , Complexity

2024

Some computations are based on structures of random partition. They take a problem of size n as input, break it into sub-problems of randomized size, execute calculations on each sub-problem, and finally combine the results of these calculations. We propose a combinatorial method for analyzing such computations and prove that the average time complexity can be expressed in terms of Stirling numbers of the second kind. The result shows that the average time complexity is decreased by about one order of magnitude compared to that of the original solution. We also show two application cases where random partition structures are applied to improve performance.
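Stirling numbers of the second kind, S(n, k), count the ways to partition n items into k non-empty blocks and satisfy the standard recurrence S(n, k) = k S(n-1, k) + S(n-1, k-1). The sketch below computes them by dynamic programming; it illustrates the quantity named in the abstract and does not reproduce the article's complexity analysis.

```python
def stirling2(n, k):
    """Stirling number of the second kind S(n, k)."""
    # table[i][j] holds S(i, j); S(0, 0) = 1 and S(i, 0) = 0 for i > 0.
    table = [[0] * (k + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            # Item i either joins one of j existing blocks or opens a new block.
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][k]


def bell(n):
    """Total number of partitions of n items, summed over the block count."""
    return sum(stirling2(n, k) for k in range(n + 1))
```

For instance, `stirling2(5, 2)` is 15 and `bell(5)` is 52.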

Journal Article

A restaurant process with cocktail bar and relations to the three-parameter Mittag–Leffler distribution

by Möhle, Martin in Champagne , Customers , Markov analysis

2021

In addition to the features of the two-parameter Chinese restaurant process (CRP), the restaurant under consideration has a cocktail bar and hence allows for a wider range of (bar and table) occupancy mechanisms. The model depends on three real parameters, $\alpha$, $\theta_1$, and $\theta_2$, fulfilling certain conditions. Results known for the two-parameter CRP are carried over to this model. We study the number of customers at the cocktail bar, the number of customers at each table, and the number of occupied tables after n customers have entered the restaurant. For $\alpha>0$ the number of occupied tables, properly scaled, is asymptotically three-parameter Mittag–Leffler distributed as n tends to infinity. We provide representations for the two- and three-parameter Mittag–Leffler distribution leading to efficient random number generators for these distributions. The proofs draw heavily from methods known for exchangeable random partitions, martingale methods known for generalized Pólya urns, and results known for the two-parameter CRP.
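As background, the standard two-parameter CRP that this model extends can be simulated in a few lines. The sketch below covers only the ordinary two-parameter seating rule, assuming 0 <= alpha < 1 and theta > -alpha; the cocktail-bar mechanism and the parameters $\theta_1$, $\theta_2$ of the article are not modelled, and the function name is illustrative.

```python
import random


def two_parameter_crp(n, alpha, theta, rng):
    """Seat n customers by the two-parameter CRP; return the table sizes.

    Assumes 0 <= alpha < 1 and theta > -alpha.
    """
    tables = []
    for _ in range(n):
        if not tables:
            # The first customer always opens the first table.
            tables.append(1)
            continue
        # An existing table of size s is chosen with weight s - alpha;
        # a new table is opened with weight theta + alpha * (number of tables).
        weights = [size - alpha for size in tables]
        weights.append(theta + alpha * len(tables))
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(tables):
            tables.append(1)
        else:
            tables[choice] += 1
    return tables
```

Calling `two_parameter_crp(200, 0.5, 1.0, random.Random(1))` returns a list of table sizes summing to 200; its length is the number of occupied tables, the statistic whose scaled limit the abstract describes.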

Journal Article
