Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
79 result(s) for "62E17"
GAUSSIAN APPROXIMATIONS AND MULTIPLIER BOOTSTRAP FOR MAXIMA OF SUMS OF HIGH-DIMENSIONAL RANDOM VECTORS
2013
We derive a Gaussian approximation result for the maximum of a sum of high-dimensional random vectors. Specifically, we establish conditions under which the distribution of the maximum is approximated by that of the maximum of a sum of Gaussian random vectors with the same covariance matrices as the original vectors. This result applies when the dimension of the random vectors (p) is large compared to the sample size (n); in fact, p can be much larger than n, without restricting correlations of the coordinates of these vectors. We also show that the distribution of the maximum of a sum of the random vectors with unknown covariance matrices can be consistently estimated by the distribution of the maximum of a sum of the conditional Gaussian random vectors obtained by multiplying the original vectors with i.i.d. Gaussian multipliers. This is the Gaussian multiplier (or wild) bootstrap procedure. Here too, p can be large or even much larger than n. These distributional approximations, either Gaussian or conditional Gaussian, yield a high-quality approximation to the distribution of the original maximum, often with approximation error decreasing polynomially in the sample size, and hence are of interest in many applications. We demonstrate how our Gaussian approximations and the multiplier bootstrap can be used for modern high-dimensional estimation, multiple hypothesis testing, and adaptive specification testing. All these results contain nonasymptotic bounds on approximation errors.
Journal Article
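The multiplier (wild) bootstrap described in the abstract above admits a compact simulation sketch: reweight the centred observations with i.i.d. standard Gaussian multipliers and recompute the maximum. This is a minimal illustration with placeholder data and illustrative sample size, dimension, and replication count, not the paper's exact procedure.

```python
# Minimal sketch of the Gaussian multiplier bootstrap for the maximum
# of a normalised sum of high-dimensional vectors; all parameters are
# illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 500                      # sample size and (large) dimension

X = rng.standard_normal((n, p))      # placeholder data; rows are the X_i
T = np.max(X.sum(axis=0) / np.sqrt(n))   # statistic: max_j n^{-1/2} sum_i X_ij

# Multiplier bootstrap: multiply the centred observations by i.i.d.
# standard Gaussian multipliers e_i and recompute the maximum.
B = 1000
Xc = X - X.mean(axis=0)              # centre, since the mean is unknown
T_boot = np.empty(B)
for b in range(B):
    e = rng.standard_normal(n)
    T_boot[b] = np.max((e[:, None] * Xc).sum(axis=0) / np.sqrt(n))

crit = np.quantile(T_boot, 0.95)     # bootstrap 95% critical value
print(T, crit)
```

The bootstrap quantile `crit` then serves as a critical value for tests based on the maximum statistic, even when p exceeds n.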
GAUSSIAN APPROXIMATION OF SUPREMA OF EMPIRICAL PROCESSES
2014
This paper develops a new direct approach to approximating suprema of general empirical processes by a sequence of suprema of Gaussian processes, without taking the route of approximating whole empirical processes in the sup-norm. We prove an abstract approximation theorem applicable to a wide variety of statistical problems, such as construction of uniform confidence bands for functions. Notably, the bound in the main approximation theorem is nonasymptotic and the theorem allows for functions that index the empirical process to be unbounded and have entropy divergent with the sample size. The proof of the approximation theorem builds on a new coupling inequality for maxima of sums of random vectors, the proof of which depends on an effective use of Stein's method for normal approximation, and some new empirical process techniques. We study applications of this approximation theorem to local and series empirical processes arising in nonparametric estimation via kernel and series methods, where the classes of functions change with the sample size and are non-Donsker. Importantly, our new technique is able to prove the Gaussian approximation for the supremum type statistics under weak regularity conditions, especially concerning the bandwidth and the number of series functions, in those examples.
Journal Article
Optimal cross-validation in density estimation with the $L^{2}$-loss
2014
We analyze the performance of cross-validation (CV) in the density estimation framework with two purposes: (i) risk estimation and (ii) model selection. The main focus is given to the so-called leave-p-out CV procedure (Lpo), where p denotes the cardinality of the test set. Closed-form expressions are derived for the Lpo estimator of the risk of projection estimators. These expressions provide a great improvement upon V-fold cross-validation in terms of variability and computational complexity.

From a theoretical point of view, the closed-form expressions also enable us to study the Lpo performance in terms of risk estimation. The optimality of leave-one-out (Loo), that is Lpo with p = 1, is proved among CV procedures used for risk estimation. Two model selection frameworks are also considered: estimation, as opposed to identification. For estimation with finite sample size n, optimality is achieved for p large enough [with p/n = o(1)] to balance the overfitting resulting from the structure of the model collection. For identification, model selection consistency is established for Lpo as long as p/n is conveniently related to the rate of convergence of the best estimator in the collection: (i) p/n → 1 as n → +∞ with a parametric rate, and (ii) p/n = o(1) with some nonparametric estimators. These theoretical results are validated by simulation experiments.
Journal Article
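Closed-form CV scores of the kind discussed above can be illustrated with the standard leave-one-out CV criterion for a histogram density estimator (a simple projection estimator); the formula below is the textbook histogram CV score, and the bin-count grid and data are illustrative, not taken from the paper.

```python
# Minimal sketch: closed-form leave-one-out CV (Lpo with p = 1) for a
# histogram density estimator; variable names and parameters are
# illustrative.
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(500)

def loo_cv_score(x, n_bins):
    """Closed-form leave-one-out CV estimate of the histogram risk."""
    n = len(x)
    counts, edges = np.histogram(x, bins=n_bins, range=(x.min(), x.max()))
    h = edges[1] - edges[0]          # bin width
    # Standard histogram CV score (unbiased up to a constant in the ISE).
    return 2.0 / ((n - 1) * h) - (n + 1) / (n**2 * (n - 1) * h) * np.sum(counts**2)

candidates = list(range(5, 60, 5))
scores = [loo_cv_score(x, m) for m in candidates]
best = candidates[int(np.argmin(scores))]
print(best)
```

Because the score is available in closed form, no data splitting or refitting is needed, which is the computational advantage over V-fold CV noted in the abstract.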
BOOTSTRAP CONFIDENCE SETS UNDER MODEL MISSPECIFICATION
2015
A multiplier bootstrap procedure for the construction of likelihood-based confidence sets is considered for finite samples and possible model misspecification. Theoretical results justify the bootstrap validity for a small or moderate sample size and allow one to control the impact of the parameter dimension p: the bootstrap approximation works if p³/n is small. The main result about bootstrap validity continues to apply even if the underlying parametric model is misspecified under the so-called small modelling bias condition. In the case when the true model deviates significantly from the considered parametric family, the bootstrap procedure is still applicable but becomes somewhat conservative: the size of the constructed confidence sets is increased by the modelling bias. We illustrate the results with numerical examples for misspecified linear and logistic regressions.
Journal Article
Finite Sample Approximation Results for Principal Component Analysis: A Matrix Perturbation Approach
2008
Principal component analysis (PCA) is a standard tool for dimensional reduction of a set of n observations (samples), each with p variables. In this paper, using a matrix perturbation approach, we study the nonasymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size n, and those of the limiting population PCA as n → ∞. As in machine learning, we present a finite sample theorem which holds with high probability for the closeness between the leading eigenvalue and eigenvector of sample PCA and population PCA under a spiked covariance model. In addition, we also consider the relation between finite sample PCA and the asymptotic results in the joint limit p, n → ∞, with p/n = c. We present a matrix perturbation view of the "phase transition phenomenon," and a simple linear-algebra-based derivation of the eigenvalue and eigenvector overlap in this asymptotic limit. Moreover, our analysis also applies for finite p, n, where we show that although there is no sharp phase transition as in the infinite case, either as a function of noise level or as a function of sample size n, the eigenvector of sample PCA may exhibit a sharp "loss of tracking," suddenly losing its relation to the (true) eigenvector of the population PCA matrix. This occurs due to a crossover between the eigenvalue due to the signal and the largest eigenvalue due to noise, whose eigenvector points in a random direction.
Journal Article
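The eigenvector overlap studied in this abstract can be simulated directly under a rank-one spiked covariance model: draw data with covariance I + signal·vvᵀ and compare the leading sample eigenvector with the population one. The sizes, spike strength, and seed below are illustrative.

```python
# Minimal sketch: sample-vs-population eigenvector overlap under a
# spiked covariance model; parameters are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n, p = 200, 100
v = np.zeros(p)
v[0] = 1.0                            # population leading eigenvector
signal = 5.0                          # spike strength, well above sqrt(p/n)

# X_i = sqrt(signal) * u_i * v + noise, so Cov(X_i) = signal * v v^T + I.
X = np.sqrt(signal) * rng.standard_normal((n, 1)) @ v[None, :] \
    + rng.standard_normal((n, p))

S = X.T @ X / n                       # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
v_hat = eigvecs[:, -1]                # leading sample eigenvector

overlap = abs(v_hat @ v)              # |<v_hat, v>|, between 0 and 1
print(overlap)
```

Lowering `signal` toward the detection threshold sqrt(p/n) and rerunning shows the overlap collapsing toward zero, the "loss of tracking" the abstract describes.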
Probabilistic zero bounds of certain random polynomials
2024
This paper introduces the notion of probabilistic zero bounds for random polynomials. It presents new results regarding the probabilistic bounds of random polynomials whose coefficients are independently and identically distributed as standard normal variates. Additionally, the paper provides a clear exposition of the developed methodology. To establish our results, we develop a novel approach utilizing the classical Cauchy’s bounds for the zeros of a deterministic polynomial with complex coefficients. We also corroborate our analytical results with extensive simulations. The methodology developed in the paper can potentially be applied to a broad class of problems regarding bounds and the distribution of zeros in the theory of random polynomials.
Journal Article
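The classical Cauchy bound that the paper above builds on is easy to check empirically: every zero z of a polynomial aₙxⁿ + … + a₀ satisfies |z| ≤ 1 + max |aₖ/aₙ|. The sketch below draws i.i.d. standard normal coefficients, as in the paper's setting; the degree and seed are illustrative.

```python
# Minimal sketch: Cauchy's zero bound checked on a random polynomial
# with i.i.d. standard normal coefficients; degree and seed are
# illustrative.
import numpy as np

rng = np.random.default_rng(3)
deg = 20
coeffs = rng.standard_normal(deg + 1)   # numpy order: highest degree first

# Cauchy bound: |z| <= 1 + max_k |a_k / a_n| for every zero z,
# where a_n = coeffs[0] is the leading coefficient.
bound = 1.0 + np.max(np.abs(coeffs[1:] / coeffs[0]))

roots = np.roots(coeffs)
print(np.max(np.abs(roots)), bound)
```

Repeating over many draws gives the kind of simulation evidence the abstract mentions for probabilistic versions of such bounds.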
Another Look at Stein’s Method for Studentized Nonlinear Statistics with an Application to U-Statistics
by Zhang, Liqian, Leung, Dennis, Shao, Qi-Man
in Inequality, Mathematics, Mathematics and Statistics
2024
We take another look at using Stein’s method to establish uniform Berry–Esseen bounds for Studentized nonlinear statistics, highlighting variable censoring and an exponential randomized concentration inequality for a sum of censored variables as the essential tools to carry out the arguments involved. As an important application, we prove a uniform Berry–Esseen bound for Studentized U-statistics in a form that exhibits the dependence on the degree of the kernel.
Journal Article
How Many Digits are Needed?
2024
Let X₁, X₂, ... be the digits in the base-q expansion of a random variable X defined on [0, 1), where q ≥ 2 is an integer. For n = 1, 2, ..., we study the probability distribution Pₙ of the (scaled) remainder Tₙ(X) = ∑_{k=n+1}^∞ X_k q^{n−k}: if X has an absolutely continuous CDF, then Pₙ converges in the total variation metric to the Lebesgue measure μ on the unit interval. Under weak smoothness conditions we establish first a coupling between X and a non-negative integer-valued random variable N so that T_N(X) follows μ and is independent of (X₁, ..., X_N), and second exponentially fast convergence of Pₙ and its PDF fₙ. We discuss how many digits are needed and show examples of our results.
Journal Article
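Since X = ∑ₖ X_k q^{−k}, the scaled remainder Tₙ(X) is just the fractional part of qⁿX, which makes the convergence to the uniform law easy to see numerically. The base, number of digits, and the Beta source distribution below are illustrative choices, not the paper's examples.

```python
# Minimal sketch: T_n(X) = fractional part of q^n X approaches the
# uniform distribution on [0, 1); the source law and parameters are
# illustrative.
import numpy as np

rng = np.random.default_rng(4)
q, n = 2, 10
X = rng.beta(2.0, 5.0, size=100_000)   # absolutely continuous CDF on [0, 1)

T = (q**n * X) % 1.0                    # scaled remainder after n digits

# Crude uniformity check: sup-distance between the empirical CDF of T
# on a grid and the identity (the uniform CDF).
grid = np.linspace(0.0, 1.0, 21)
ecdf = np.searchsorted(np.sort(T), grid) / len(T)
print(np.max(np.abs(ecdf - grid)))
```

Increasing n shrinks the printed discrepancy rapidly, in line with the exponentially fast convergence stated in the abstract.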
Statistical inference and data analysis of the record-based transmuted Burr X model
2025
Probability distributions have proven their usefulness in almost every discipline of human endeavor. A novel extension of the Burr X distribution is developed in this study employing the record-based transmuted mapping technique, which can be used to fit skewed and complex data. We refer to this novel distribution as the record-based transmuted Burr X model. We establish the shape of the probability density function and hazard function. Numerous statistical and mathematical properties are provided, including the quantile function, moments, and order statistics of the proposed model. Further, we obtain estimates of the model parameters using the maximum likelihood estimation method, and four sets of Monte Carlo simulation studies are carried out to evaluate the efficiency of these estimates. Finally, the practical applicability of the developed model is demonstrated by analyzing three data sets, comparing its performance with several well-known distributions. The results highlight the flexibility and accuracy of the model, establishing it as a powerful and reliable tool for advanced statistical modeling in environmental and survival research.
Journal Article
Improved Bounds in Stein’s Method for Functions of Multivariate Normal Random Vectors
2024
In a recent paper, Gaunt (Ann I H Poincare Probab Stat 56:1484–1513, 2020) extended Stein's method to limit distributions that can be represented as a function g : ℝᵈ → ℝ of a centred multivariate normal random vector Σ^{1/2} Z, with Z a standard d-dimensional multivariate normal random vector and Σ a non-negative-definite covariance matrix. In this paper, we obtain improved bounds, in the sense of weaker moment conditions, smaller constants and simpler forms, for the case that g has derivatives with polynomial growth. We obtain new non-uniform bounds for the derivatives of the solution of the Stein equation and use these inequalities to obtain general bounds on the distance, measured using smooth test functions, between the distributions of g(Wₙ) and g(Z), where Wₙ is a standardised sum of random vectors with independent components and Z is a standard d-dimensional multivariate normal random vector. We apply these general bounds to obtain bounds for the chi-square approximation of the family of power divergence statistics (special cases include the Pearson and likelihood ratio statistics), for the case of two cell classifications, that improve on existing results in the literature.
Journal Article
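The two power divergence statistics singled out in the abstract above, Pearson's chi-square and the likelihood ratio statistic, take a particularly simple form in the two-cell case to which the paper's application refers. The sketch below computes both for simulated counts under the null; sample size, null probability, and seed are illustrative.

```python
# Minimal sketch: Pearson and likelihood-ratio statistics for a
# two-cell classification, both members of the power divergence
# family; parameters are illustrative.
import numpy as np

rng = np.random.default_rng(5)
n, p0 = 500, 0.3                      # sample size and null cell probability
N = rng.binomial(n, p0)               # observed count in the first cell

obs = np.array([N, n - N], dtype=float)
exp = np.array([n * p0, n * (1 - p0)])

pearson = np.sum((obs - exp) ** 2 / exp)           # power divergence, lambda = 1
lik_ratio = 2.0 * np.sum(obs * np.log(obs / exp))  # power divergence, lambda -> 0
print(pearson, lik_ratio)
```

Under the null, both statistics are approximately chi-square with one degree of freedom; the quality of that approximation is what the paper's bounds quantify.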