33 result(s) for "Sparse alternative"
Cauchy Combination Test: A Powerful Test With Analytic p-Value Calculation Under Arbitrary Dependency Structures
Combining individual p-values to aggregate multiple small effects has a long-standing interest in statistics, dating back to the classic Fisher's combination test. In modern large-scale data analysis, correlation and sparsity are common features and efficient computation is a necessary requirement for dealing with massive data. To overcome these challenges, we propose a new test that takes advantage of the Cauchy distribution. Our test statistic has a simple form and is defined as a weighted sum of Cauchy transformations of individual p-values. We prove a nonasymptotic result that the tail of the null distribution of our proposed test statistic can be well approximated by a Cauchy distribution under arbitrary dependency structures. Based on this theoretical result, the p-value calculation of our proposed test is not only accurate, but also as simple as the classic z-test or t-test, making our test well suited for analyzing massive data. We further show that the power of the proposed test is asymptotically optimal in a strong sparsity setting. Extensive simulations demonstrate that the proposed test has both strong power against sparse alternatives and good accuracy with respect to p-value calculations, especially for very small p-values. The proposed test has also been applied to a genome-wide association study of Crohn's disease and compared with several existing tests. Supplementary materials for this article are available online.
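The statistic described in this abstract can be illustrated with a minimal sketch. It assumes the usual form of the Cauchy transformation, tan((0.5 - p)π), nonnegative weights that sum to one, and the standard Cauchy tail for the combined p-value. The function name `cauchy_combination` and the equal-weight default are illustrative choices, not taken from the article.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with a weighted sum of Cauchy-transformed values.

    Assumption: weights are nonnegative and sum to 1 (equal weights by default).
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("weights and p_values must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    # Each p-value is mapped to a standard Cauchy variate under the null.
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values))

    # Upper-tail probability of a standard Cauchy distribution at the statistic.
    return 0.5 - math.atan(statistic) / math.pi


if __name__ == "__main__":
    print(cauchy_combination([0.01, 0.9]))
```

With a single p-value the function returns that p-value unchanged (up to rounding), which is a quick sanity check on the transformation.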
POWER ENHANCEMENT IN HIGH-DIMENSIONAL CROSS-SECTIONAL TESTS
We propose a novel technique to boost the power of testing a high-dimensional vector H0: θ = 0 against sparse alternatives where the null hypothesis is violated by only a few components. Existing tests based on quadratic forms such as the Wald statistic often suffer from low power due to the accumulation of errors in estimating high-dimensional parameters. More powerful tests for sparse alternatives such as thresholding and extreme value tests, on the other hand, require either stringent conditions or bootstrap to derive the null distribution and often suffer from size distortions due to the slow convergence. Based on a screening technique, we introduce a "power enhancement component," which is zero under the null hypothesis with high probability, but diverges quickly under sparse alternatives. The proposed test statistic combines the power enhancement component with an asymptotically pivotal statistic, and strengthens the power under sparse alternatives. The null distribution does not require stringent regularity conditions, and is completely determined by that of the pivotal statistic. The proposed methods are then applied to testing the factor pricing models and validating the cross-sectional independence in panel data models.
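The idea of adding a screening-based component to a pivotal statistic can be sketched as follows. Everything specific here is an assumption made for illustration: the threshold log(log n)·sqrt(log p), the sqrt(p) scaling of the screened sum, and a standardized sum of squares as the pivotal part (which presumes independent components with known standard errors). The abstract does not state these details, and the function name `power_enhanced_statistic` is hypothetical.

```python
import math


def power_enhanced_statistic(estimates, std_errors, n_obs):
    """Return (j0, j1, j0 + j1) for a toy power-enhanced test statistic.

    j0: screening component, zero unless some standardized estimate is large.
    j1: standardized quadratic statistic (assumes independent components).
    """
    p = len(estimates)
    if p < 2 or len(std_errors) != p:
        raise ValueError("need at least two estimates with matching std errors")
    if n_obs < 3:
        raise ValueError("n_obs must be at least 3 so that log(log n) > 0")

    z = [e / s for e, s in zip(estimates, std_errors)]

    # Assumed screening threshold; it grows slowly so noise rarely exceeds it.
    delta = math.log(math.log(n_obs)) * math.sqrt(math.log(p))

    # Only components that survive screening contribute to j0.
    j0 = math.sqrt(p) * sum(v * v for v in z if abs(v) > delta)

    # Sum of squares, centered and scaled as if each z is standard normal.
    j1 = (sum(v * v for v in z) - p) / math.sqrt(2.0 * p)

    return j0, j1, j0 + j1


if __name__ == "__main__":
    sparse = [10.0] + [0.0] * 99
    print(power_enhanced_statistic(sparse, [1.0] * 100, n_obs=200))
```

In this toy setup a single large component leaves the quadratic part small while the screened component is large, which is the behavior the abstract describes under sparse alternatives.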
TEST FOR HIGH-DIMENSIONAL CORRELATION MATRICES
Testing correlation structures has attracted extensive attention in the literature due to both its importance in real applications and several major theoretical challenges. The aim of this paper is to develop a general framework of testing correlation structures for the one-, two-, and multiple-sample testing problems under a high-dimensional setting when both the sample size and data dimension go to infinity. Our test statistics are designed to deal with both dense and sparse alternatives. We systematically investigate the asymptotic null distribution, power function and unbiasedness of each test statistic. Theoretically, we make great efforts to deal with the nonindependency of all random matrices of the sample correlation matrices. We use simulation studies and real data analysis to illustrate the versatility and practicability of our test statistics.
Two-sample tests of high-dimensional means for compositional data
Compositional data are ubiquitous in many scientific endeavours. Motivated by microbiome and metagenomic research, we consider a two-sample testing problem for high-dimensional compositional data and formulate a testable hypothesis of compositional equivalence for the means of two latent log basis vectors. We propose a test through the centred log-ratio transformation of the compositions. The asymptotic null distribution of the test statistic is derived and its power against sparse alternatives is investigated. A modified test for paired samples is also considered. Simulations show that the proposed tests can be significantly more powerful than tests that are applied to the raw and log-transformed compositions. The usefulness of our tests is illustrated by applications to gut microbiome composition in obesity and Crohn’s disease.
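For readers unfamiliar with the centred log-ratio transformation mentioned in this abstract, here is a minimal sketch of its standard definition: each log component minus the mean of the log components. The function name `clr` is illustrative, and the sketch assumes strictly positive components (zero counts would need separate handling, which is not covered here).

```python
import math


def clr(composition):
    """Centred log-ratio transform of one composition (all parts > 0)."""
    if len(composition) == 0:
        raise ValueError("composition must not be empty")
    if any(x <= 0 for x in composition):
        raise ValueError("all components must be strictly positive")

    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)

    # Subtracting the mean log removes the arbitrary overall scale.
    return [v - mean_log for v in logs]


if __name__ == "__main__":
    print(clr([0.2, 0.3, 0.5]))
```

The transformed values sum to zero and do not change when the composition is rescaled, so relative abundances and raw counts give the same result.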
Weighted Statistic in Detecting Faint and Sparse Alternatives for High-Dimensional Covariance Matrices
This article considers testing equality of two population covariance matrices when the data dimension p diverges with the sample size n (p/n → c > 0). We propose a weighted test statistic that is data-driven and powerful in both faint alternatives (many small disturbances) and sparse alternatives (several large disturbances). Its asymptotic null distribution is derived by large random matrix theory without assuming the existence of a limiting cumulative distribution function of the population covariance matrix. The simulation results confirm that our statistic is powerful against all alternatives, while other tests given in the literature fail in at least one situation. Supplementary materials for this article are available online.
HYPOTHESIS TESTING FOR BLOCK-STRUCTURED CORRELATION FOR HIGH-DIMENSIONAL VARIABLES
Testing the independence or block independence of high-dimensional random vectors is important in multivariate statistical analysis. Recent works on high-dimensional block-independence tests aim to extend their validity beyond specific distributions (e.g., Gaussian) or restrictive block sizes. In this paper, we propose a new and powerful test for the block-structured correlation of high-dimensional random vectors, for sparse or nonsparse alternatives, without strict distributional assumptions. The statistical properties of the proposed test are developed under the asymptotic regime that the dimension grows proportionally with the sample size. Empirically, we find that the proposed test outperforms existing tests for a variety of alternatives, and works quite well when there are few existing tests at our disposal.
HIGH-DIMENSIONAL TWO-SAMPLE COVARIANCE MATRIX TESTING VIA SUPER-DIAGONALS
This paper considers testing for two-sample covariance matrices of high-dimensional populations. We formulate a multiple test procedure by comparing the super-diagonals of the covariance matrices. The asymptotic distributions of the test statistics are derived and the powers of individual tests are studied. The test statistics, by focusing on the super-diagonals, have smaller variation than the existing tests that target the entire covariance matrix. The advantage of the proposed test is demonstrated by simulation studies, as well as an empirical study on a prostate cancer dataset.
On set-based association tests
Motivated by, but not limited to, association analyses of multiple genetic variants, we propose here a summary statistics-based regression framework. The proposed method requires only variant-specific summary statistics, and it unifies earlier methods based on individual-level data as special cases. The resulting score test statistic, derived from a linear mixed-effect regression model, inherently transforms the variant-specific statistics using the precision matrix to improve power for detecting sparse alternatives. Furthermore, the proposed method can incorporate additional variant-specific information with ease, facilitating omic-data integration. We study the asymptotic properties of the proposed tests under the null and alternatives, and we investigate efficient P-value calculation in finite samples. Finally, we provide supporting empirical evidence from extensive simulation studies and two applications.
A maximum-type microbial differential abundance test with application to high-dimensional microbiome data analyses
Background: High-throughput metagenomic sequencing technologies have shown prominent advantages over traditional pathogen detection methods, bringing great potential in clinical pathogen diagnosis and treatment of infectious diseases. Nevertheless, how to accurately detect the difference in microbiome profiles between treatment or disease conditions remains computationally challenging. Results: In this study, we propose a novel test for identifying the difference between two high-dimensional microbiome abundance data matrices based on the centered log-ratio transformation of the microbiome compositions. The test p-value can be calculated directly with a closed-form solution from the derived asymptotic null distribution. We also investigate the asymptotic statistical power against sparse alternatives that are typically encountered in microbiome studies. The proposed test is maximum-type equal-covariance-assumption-free (MECAF), making it widely applicable to studies that compare microbiome compositions between conditions. Our simulation studies demonstrated that the proposed MECAF test achieves more desirable power than competing methods while having the type I error rate well controlled under various scenarios. The usefulness of the proposed test is further illustrated with two real microbiome data analyses. The source code of the proposed method is freely available at https://github.com/Jiyuan-NYU-Langone/MECAF. Conclusions: MECAF is a flexible differential abundance test and achieves statistical efficiency in analyzing high-throughput microbiome data. The proposed new method will allow us to efficiently discover shifts in microbiome abundances between disease and treatment conditions, broadening our understanding of the disease and ultimately improving clinical diagnosis and treatment.
Image Restoration via Simultaneous Sparse Coding: Where Structured Sparsity Meets Gaussian Scale Mixture
In image processing, sparse coding has been known to be relevant to both variational and Bayesian approaches. The regularization parameter in variational image restoration is intrinsically connected with the shape parameter of sparse coefficients’ distribution in Bayesian methods. How to set those parameters in a principled yet spatially adaptive fashion turns out to be a challenging problem, especially for the class of nonlocal image models. In this work, we propose a structured sparse coding framework to address this issue. More specifically, a nonlocal extension of the Gaussian scale mixture (GSM) model is developed using simultaneous sparse coding (SSC), and its applications to image restoration are explored. It is shown that the variances of sparse coefficients (the field of scalar multipliers of Gaussians), if treated as a latent variable, can be jointly estimated along with the unknown sparse coefficients via the method of alternating optimization. When applied to image restoration, our experimental results have shown that the proposed SSC–GSM technique can both preserve the sharpness of edges and suppress undesirable artifacts. Thanks to its capability of achieving a better spatial adaptation, SSC–GSM based image restoration often delivers reconstructed images with higher subjective/objective qualities than other competing approaches.