Catalogue Search | MBRL
18,363 result(s) for "covariance matrices"
Large covariance estimation by thresholding principal orthogonal complements
by Fan, Jianqing; Mincheva, Martina; Liao, Yuan
in Analysis of covariance; Approximate factor model; Approximation
2013
The paper deals with the estimation of a high dimensional covariance with a conditional sparsity structure and fast diverging eigenvalues. By assuming a sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the principal orthogonal complement thresholding method 'POET' to explore such an approximate factor structure with sparsity. The POET-estimator includes the sample covariance matrix, the factor-based covariance matrix, the thresholding estimator and the adaptive thresholding estimator as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the effect of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented.
Journal Article
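The POET construction described in the abstract above — keep the top-K principal components as the factor part and threshold the residual ("principal orthogonal complement") covariance — can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the known factor number `K` and the hard-thresholding level `tau` are assumptions of the sketch:

```python
import numpy as np

def poet(X, K, tau):
    """A minimal sketch of POET-style covariance estimation.

    X   : (n, p) data matrix, rows are observations
    K   : number of factors kept (assumed known here)
    tau : hard-thresholding level for the residual covariance
    """
    n, p = X.shape
    S = np.cov(X, rowvar=False, bias=True)       # sample covariance
    vals, vecs = np.linalg.eigh(S)               # eigenvalues, ascending
    vals, vecs = vals[::-1], vecs[:, ::-1]       # reorder to descending

    # Low-rank (factor) part from the top-K principal components
    low_rank = (vecs[:, :K] * vals[:K]) @ vecs[:, :K].T

    # Principal orthogonal complement: residual covariance
    R = S - low_rank

    # Hard-threshold off-diagonal entries, keep the diagonal intact
    R_thr = np.where(np.abs(R) >= tau, R, 0.0)
    np.fill_diagonal(R_thr, np.diag(R))

    return low_rank + R_thr
```

With `tau = 0` this reduces to the sample covariance; with `K = 0` it reduces to a pure thresholding estimator, mirroring the special cases the abstract mentions.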
ESTIMATING SPARSE PRECISION MATRIX: OPTIMAL RATES OF CONVERGENCE AND ADAPTIVE ESTIMATION
2016
The precision matrix is of significant importance in a wide range of applications in multivariate analysis. This paper considers adaptive minimax estimation of sparse precision matrices in the high-dimensional setting. Optimal rates of convergence are established for a range of matrix norm losses. A fully data-driven estimator based on adaptive constrained ℓ₁ minimization is proposed and its rate of convergence is obtained over a collection of parameter spaces. The estimator, called ACLIME, is easy to implement and performs well numerically. A major step in establishing the minimax rate of convergence is the derivation of a rate-sharp lower bound. A "two-directional" lower bound technique is applied to obtain the minimax lower bound. The upper and lower bounds together yield the optimal rates of convergence for sparse precision matrix estimation and show that the ACLIME estimator is adaptively minimax rate optimal for a collection of parameter spaces and a range of matrix norm losses simultaneously.
Journal Article
A Constrained ℓ1 Minimization Approach to Sparse Precision Matrix Estimation
by Cai, Tony; Liu, Weidong; Luo, Xi
in Acceleration of convergence; Analytical estimating; Applications
2011
This article proposes a constrained ℓ₁ minimization method for estimating a sparse inverse covariance matrix based on a sample of n iid p-variate random variables. The resulting estimator is shown to have a number of desirable properties. In particular, a rate of convergence between the estimator and the true s-sparse precision matrix under the spectral norm is established when the population distribution has either exponential-type tails or polynomial-type tails. We present convergence rates under the elementwise ℓ∞ norm and Frobenius norm. In addition, we consider graphical model selection. The procedure is easily implemented by linear programming. Numerical performance of the estimator is investigated using both simulated and real data. In particular, the procedure is applied to analyze a breast cancer dataset and is found to perform favorably compared with existing methods.
Journal Article
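The abstract notes that the procedure is easily implemented by linear programming. A minimal sketch of a CLIME-style estimator along those lines, solving one small LP per column with SciPy; the tuning level `lam` and the symmetrization rule follow the usual description of the method, and this is an illustration rather than the authors' code:

```python
import numpy as np
from scipy.optimize import linprog

def clime(S, lam):
    """Sketch of a CLIME-style precision matrix estimator.

    For each column j, solves  min ||b||_1  s.t.  ||S b - e_j||_inf <= lam,
    then symmetrizes by keeping the entry of smaller magnitude.
    S   : (p, p) sample covariance matrix
    lam : constraint level (a tuning parameter)
    """
    p = S.shape[0]
    Omega = np.zeros((p, p))
    # Write b = u - v with u, v >= 0 and minimize sum(u) + sum(v).
    c = np.ones(2 * p)
    A = np.vstack([np.hstack([S, -S]),    #  S b <= e_j + lam
                   np.hstack([-S, S])])   # -S b <= -e_j + lam
    for j in range(p):
        e = np.zeros(p)
        e[j] = 1.0
        b_ub = np.concatenate([e + lam, -e + lam])
        res = linprog(c, A_ub=A, b_ub=b_ub,
                      bounds=[(0, None)] * (2 * p), method="highs")
        Omega[:, j] = res.x[:p] - res.x[p:]
    # Symmetrize: at each (i, j) keep the entry with smaller absolute value
    keep = np.abs(Omega) <= np.abs(Omega.T)
    return np.where(keep, Omega, Omega.T)
```

For a well-conditioned S and small `lam`, the output approximates the matrix inverse; sparsity is induced as `lam` grows.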
LIMITING LAWS FOR DIVERGENT SPIKED EIGENVALUES AND LARGEST NONSPIKED EIGENVALUE OF SAMPLE COVARIANCE MATRICES
by Han, Xiao; Pan, Guangming; Cai, T. Tony
in Asymptotic methods; Asymptotic properties; Constraining
2020
We study the asymptotic distributions of the spiked eigenvalues and the largest nonspiked eigenvalue of the sample covariance matrix under a general covariance model with divergent spiked eigenvalues, while the other eigenvalues are bounded but otherwise arbitrary. The limiting normal distribution for the spiked sample eigenvalues is established. It has the distinct features that the asymptotic mean relies not only on the population spikes but also on the nonspikes, and that the asymptotic variance in general depends on the population eigenvectors. In addition, the limiting Tracy–Widom law for the largest nonspiked sample eigenvalue is obtained. Estimation of the number of spikes and the convergence of the leading eigenvectors are also considered. The results hold even when the number of spikes diverges. As a key technical tool, we develop a central limit theorem for a type of random quadratic forms in which the random vectors and random matrices involved are dependent. This result may be of independent interest.
Journal Article
On the meta-analysis of response ratios for studies with correlated and multi-group designs
by Lajeunesse, Marc J.
in Animal and plant ecology; Animal, plant and microbial ecology; Biological and medical sciences
2011
A common effect size metric used to quantify the outcome of experiments for ecological meta-analysis is the response ratio (RR): the log proportional change in the means of a treatment and control group. Estimates of the variance of RR are also important for meta-analysis because they serve as weights when effect sizes are averaged and compared. The variance of an effect size is typically a function of sampling error; however, it can also be influenced by study design. Here, I derive new variances and covariances for RR for several often-encountered experimental designs: when the treatment and control means are correlated; when multiple treatments have a common control; when means are based on repeated measures; and when the study has a correlated factorial design, or is multivariate. These developments are useful for improving the quality of data extracted from studies for meta-analysis and help address some of the common challenges meta-analysts face when quantifying a diversity of experimental designs with the response ratio.
Journal Article
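As a concrete illustration of the response ratio and its sampling variance, here is a small sketch. The independent-groups variance formula is standard; the correction with correlation parameter `r` is one commonly used form for correlated (e.g., paired) designs and is an assumption of this sketch, not a transcription of the paper's derivations:

```python
import math

def lnrr(mean_t, mean_c):
    """Log response ratio: log proportional change of treatment vs. control."""
    return math.log(mean_t / mean_c)

def var_lnrr(mean_t, sd_t, n_t, mean_c, sd_c, n_c, r=0.0):
    """Sampling variance of the log response ratio.

    With r = 0 this is the usual independent-groups variance via the
    delta method; a nonzero r subtracts a covariance term for correlated
    (e.g., paired) treatment and control measurements.
    """
    v = sd_t**2 / (n_t * mean_t**2) + sd_c**2 / (n_c * mean_c**2)
    v -= 2.0 * r * sd_t * sd_c / (math.sqrt(n_t * n_c) * mean_t * mean_c)
    return v
```

In a meta-analysis, `1 / var_lnrr(...)` would serve as the inverse-variance weight when averaging effect sizes across studies.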
SPIKED SEPARABLE COVARIANCE MATRICES AND PRINCIPAL COMPONENTS
2021
We study a class of separable sample covariance matrices of the form 𝒬̃₁ := Ã^{1/2} X B̃ X* Ã^{1/2}. Here, Ã and B̃ are positive definite matrices whose spectra consist of bulk spectra plus several spikes, that is, larger eigenvalues that are separated from the bulks. Conceptually, we call 𝒬̃₁ a spiked separable covariance matrix model. On the one hand, this model includes the spiked covariance matrix as a special case with B̃ = I. On the other hand, it allows for more general correlations of datasets. In particular, for spatio-temporal datasets, Ã and B̃ represent the spatial and temporal correlations, respectively.
In this paper, we study the outlier eigenvalues and eigenvectors, that is, the principal components, of the spiked separable covariance model 𝒬̃₁. We prove the convergence of the outlier eigenvalues λ̃ᵢ and the generalized components (i.e., 〈v, ξ̃ᵢ〉 for any deterministic vector v) of the outlier eigenvectors ξ̃ᵢ with optimal convergence rates. Moreover, we also prove the delocalization of the nonoutlier eigenvectors. We state our results in full generality, in the sense that they also hold near the so-called BBP transition and for degenerate outliers. Our results highlight both the similarity and difference between the spiked separable covariance matrix model and the spiked covariance matrix model in (Probab. Theory Related Fields 164 (2016) 459–552). In particular, we show that the spikes of both Ã and B̃ will cause outliers of the eigenvalue spectrum, and the eigenvectors can help to select the outliers that correspond to the spikes of Ã (or B̃).
Journal Article
Information-Based Optimal Subdata Selection for Big Data Linear Regression
by Yang, Min; Stufken, John; Wang, HaiYing
in Analysis of covariance; Big Data; Computer simulation
2019
Extraordinary amounts of data are being produced in many branches of science. Proven statistical methods are no longer applicable to extraordinarily large datasets due to computational limitations. A critical step in big data analysis is data reduction. Existing investigations in the context of linear regression focus on subsampling-based methods. However, not only is this approach prone to sampling errors, it also leads to a covariance matrix of the estimators that is typically bounded from below by a term that is of the order of the inverse of the subdata size. We propose a novel approach, termed information-based optimal subdata selection (IBOSS). Compared to leading existing subdata methods, the IBOSS approach has the following advantages: (i) it is significantly faster; (ii) it is suitable for distributed parallel computing; (iii) the variances of the slope parameter estimators converge to 0 as the full data size increases even if the subdata size is fixed, that is, the convergence rate depends on the full data size; (iv) data analysis for IBOSS subdata is straightforward and the sampling distribution of an IBOSS estimator is easy to assess. Theoretical results and extensive simulations demonstrate that the IBOSS approach is superior to subsampling-based methods, sometimes by orders of magnitude. The advantages of the new approach are also illustrated through analysis of real data. Supplementary materials for this article are available online.
Journal Article
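The D-optimality-motivated IBOSS idea — for each covariate, keep the rows with the most extreme values — can be sketched as follows. This is an illustrative simplification, not the authors' implementation; the choice `r = k // (2p)` (r smallest and r largest rows per covariate) follows the usual description of the algorithm:

```python
import numpy as np

def iboss(X, k):
    """Sketch of IBOSS-style subdata selection (D-optimality flavour).

    For each of the p covariates in turn, take the r rows with the
    smallest and the r rows with the largest values among rows not
    yet selected, where r = k // (2 * p).
    Returns the indices of the selected rows.
    """
    n, p = X.shape
    r = max(1, k // (2 * p))
    selected = []
    available = np.ones(n, dtype=bool)
    for j in range(p):
        idx = np.where(available)[0]
        order = idx[np.argsort(X[idx, j])]       # sort remaining rows by column j
        pick = np.concatenate([order[:r], order[-r:]])
        selected.extend(pick.tolist())
        available[pick] = False                  # never select a row twice
    return np.array(selected)
```

The selected subdata would then be fed to an ordinary least-squares fit; because the design points are extreme in every covariate, the information matrix of the subdata grows with the full sample size even for fixed k.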
POWER ENHANCEMENT IN HIGH-DIMENSIONAL CROSS-SECTIONAL TESTS
2015
We propose a novel technique to boost the power of testing a high-dimensional vector H₀ : θ = 0 against sparse alternatives where the null hypothesis is violated by only a few components. Existing tests based on quadratic forms such as the Wald statistic often suffer from low powers due to the accumulation of errors in estimating high-dimensional parameters. More powerful tests for sparse alternatives such as thresholding and extreme value tests, on the other hand, require either stringent conditions or bootstrap to derive the null distribution and often suffer from size distortions due to the slow convergence. Based on a screening technique, we introduce a "power enhancement component," which is zero under the null hypothesis with high probability, but diverges quickly under sparse alternatives. The proposed test statistic combines the power enhancement component with an asymptotically pivotal statistic, and strengthens the power under sparse alternatives. The null distribution does not require stringent regularity conditions, and is completely determined by that of the pivotal statistic. The proposed methods are then applied to testing the factor pricing models and validating the cross-sectional independence in panel data models.
Journal Article
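The screening construction in the abstract — a component J₀ that is zero with high probability under the null but diverges under sparse alternatives, added to an asymptotically pivotal statistic — can be sketched as follows. The specific threshold `delta` used here is illustrative, not the paper's exact choice:

```python
import numpy as np

def power_enhanced_stat(theta_hat, se, n):
    """Sketch of a power-enhancement test statistic.

    J0 screens for components whose standardized estimates exceed a
    high threshold delta, so that J0 = 0 with high probability under
    the null; J1 is a Wald-type quadratic form. The combined statistic
    J0 + J1 keeps the null distribution of J1 while gaining power
    against sparse alternatives.
    """
    p = len(theta_hat)
    z = theta_hat / se
    delta = np.log(np.log(n)) * np.sqrt(np.log(p))   # illustrative threshold
    screened = np.abs(z) > delta
    J0 = np.sqrt(p) * np.sum(z[screened] ** 2)       # power enhancement component
    J1 = np.sum(z ** 2)                              # pivotal Wald-type part
    return J0 + J1, J0
```

Under the null all standardized estimates stay below `delta`, so `J0` vanishes and inference proceeds from the distribution of `J1` alone.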
SPECTRUM ESTIMATION FROM SAMPLES
2017
We consider the problem of approximating the set of eigenvalues of the covariance matrix of a multivariate distribution (equivalently, the problem of approximating the "population spectrum"), given access to samples drawn from the distribution. We consider this recovery problem in the regime where the sample size is comparable to, or even sublinear in the dimensionality of the distribution. First, we propose a theoretically optimal and computationally efficient algorithm for recovering the moments of the eigenvalues of the population covariance matrix. We then leverage this accurate moment recovery, via a Wasserstein distance argument, to accurately reconstruct the vector of eigenvalues. Together, this yields an eigenvalue reconstruction algorithm that is asymptotically consistent as the dimensionality of the distribution and sample size tend toward infinity, even in the sublinear sample regime where the ratio of the sample size to the dimensionality tends to zero. In addition to our theoretical results, we show that our approach performs well in practice for a broad range of distributions and sample sizes.
Journal Article
SPARSE PCA: OPTIMAL RATES AND ADAPTIVE ESTIMATION
2013
Principal component analysis (PCA) is one of the most commonly used statistical procedures with a wide range of applications. This paper considers both minimax and adaptive estimation of the principal subspace in the high dimensional setting. Under mild technical conditions, we first establish the optimal rates of convergence for estimating the principal subspace which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate. The lower bound is obtained by calculating the local metric entropy and an application of Fano's lemma. The rate optimal estimator is constructed using aggregation, which, however, might not be computationally feasible. We then introduce an adaptive procedure for estimating the principal subspace which is fully data driven and can be computed efficiently. It is shown that the estimator attains the optimal rates of convergence simultaneously over a large collection of the parameter spaces. A key idea in our construction is a reduction scheme which reduces the sparse PCA problem to a high-dimensional multivariate regression problem. This method is potentially also useful for other related problems.
Journal Article
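For contrast with the paper's aggregation and regression-based estimators (which are considerably more involved), a simple classical baseline for sparse PCA is diagonal thresholding: screen coordinates by sample variance, then run PCA on the selected submatrix. This sketch is only a heuristic illustration and assumes unit noise variance in its threshold:

```python
import numpy as np

def diagonal_thresholding_pca(X, k, alpha):
    """Sparse-PCA sketch via diagonal thresholding (screen-then-PCA).

    Keep the coordinates whose sample variance exceeds a threshold
    (calibrated here assuming unit noise variance), run PCA on that
    submatrix, and embed the top-k loadings back into R^p.
    """
    n, p = X.shape
    variances = X.var(axis=0, ddof=1)
    thr = 1.0 + alpha * np.sqrt(np.log(p) / n)     # illustrative threshold
    support = np.where(variances > thr)[0]
    V = np.zeros((p, k))
    if support.size == 0:
        return V, support
    S_sub = np.atleast_2d(np.cov(X[:, support], rowvar=False))
    vals, vecs = np.linalg.eigh(S_sub)             # eigenvalues ascending
    k_eff = min(k, support.size)
    V[support, :k_eff] = vecs[:, ::-1][:, :k_eff]  # top-k_eff loadings
    return V, support
```

The resulting loading vectors are exactly zero off the selected support, which is the sparsity structure this family of methods exploits.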