Search Results
1,622 result(s) for "Dimension reduction"
Robust Methods for Data Reduction
This book gives a non-technical overview of robust data reduction techniques, encouraging the use of these important and useful methods in practical applications. The main areas covered include principal component analysis, sparse principal component analysis, canonical correlation analysis, factor analysis, clustering, double clustering, and discriminant analysis. Using real examples, the authors show how to implement the procedures in R. The code and data for the examples are available on the book's CRC Press web page.
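The book's examples are in R; as a rough Python analogue (not the book's own code), scikit-learn provides both plain and sparse PCA. The toy data below are made up purely for illustration.

import numpy as np
from sklearn.decomposition import PCA, SparsePCA

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))      # hypothetical toy data
X[:, 0] += 2 * X[:, 1]                  # induce some correlation

pca = PCA(n_components=2).fit(X)        # classical principal components
spca = SparsePCA(n_components=2, alpha=1.0, random_state=0).fit(X)

print(pca.explained_variance_ratio_)    # variance captured by each component
print(spca.components_)                 # sparse loadings (many exact zeros)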
A General Theory for Nonlinear Sufficient Dimension Reduction: Formulation and Estimation
In this paper we introduce a general theory for nonlinear sufficient dimension reduction, and explore its ramifications and scope. This theory subsumes recent work employing reproducing kernel Hilbert spaces, and reveals many parallels between linear and nonlinear sufficient dimension reduction. Using these parallels we analyze the properties of existing methods and develop new ones. We begin by characterizing dimension reduction at the general level of σ-fields and proceed to that of classes of functions, leading to the notions of sufficient, complete and central dimension reduction classes. We show that, when it exists, the complete and sufficient class coincides with the central class, and can be unbiasedly and exhaustively estimated by a generalized sliced inverse regression estimator (GSIR). When completeness does not hold, this estimator captures only part of the central class. However, in these cases we show that a generalized sliced average variance estimator (GSAVE) can capture a larger portion of the class. Both estimators require no numerical optimization because they can be computed by spectral decomposition of linear operators. Finally, we compare our estimators with existing methods by simulation and on actual data sets.
A Review on Dimension Reduction
Summarizing the effect of many covariates through a few linear combinations is an effective way of reducing covariate dimension and is the backbone of (sufficient) dimension reduction. Because the replacement of high-dimensional covariates by low-dimensional linear combinations is performed with minimal assumptions on the specific regression form, it enjoys attractive advantages, as well as unique challenges, in comparison with the variable selection approach. We review the current literature on dimension reduction with an emphasis on the two most popular models, where the dimension reduction affects the conditional distribution and the conditional mean, respectively. We discuss various estimation and inference procedures at different levels of detail, with the intention of focusing on their underlying ideas rather than technicalities. We also discuss some unsolved problems in this area for potential future research.
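For reference, the two models the review emphasizes can be written compactly in standard SDR notation, with B a p x d matrix and d much smaller than p:

    Y \perp\!\!\!\perp X \mid B^{\top}X                    % conditional-distribution model
    \mathrm{E}(Y \mid X) = \mathrm{E}(Y \mid B^{\top}X)    % conditional-mean model

In both cases the regression of Y on the p covariates X is reduced to a regression on the d linear combinations B^T X, without specifying the functional form of the dependence.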
Slice weighted average regression
It has previously been shown that ordinary least squares can be used to estimate the coefficients of the single-index model under only mild conditions. However, the estimator is not robust, leading to poor estimates for some models. In this paper we propose a new sliced least-squares estimator that utilizes ideas from Sliced Inverse Regression. Slices with problematic observations that contribute to high variability in the estimator can easily be down-weighted to robustify the procedure. The estimator is simple to implement and can yield substantial improvements over the usual least-squares approach for some models. While the estimator was initially conceived with the single-index model in mind, we also show that multiple directions can be obtained, providing another notable advantage of using slicing with least squares. Several simulation studies and a real data example are included, as well as comparisons with other recent methods.
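For context, here is a minimal numpy sketch of classical sliced inverse regression, the building block the paper modifies; this is the textbook algorithm, not the authors' weighted estimator.

import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    # Classical SIR: standardize X, average it within slices of y,
    # then eigendecompose the weighted covariance of the slice means.
    n, p = X.shape
    Z = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(np.cov(Z, rowvar=False))
    inv_sqrt = U @ np.diag(1.0 / np.sqrt(s)) @ U.T   # Sigma^(-1/2)
    Z = Z @ inv_sqrt
    order = np.argsort(y)                 # slice observations by y
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    _, vecs = np.linalg.eigh(M)           # eigenvalues in ascending order
    B = inv_sqrt @ vecs[:, ::-1][:, :n_dirs]
    return B / np.linalg.norm(B, axis=0)

Down-weighting problematic slices, as the abstract proposes, would amount to replacing the proportion weights len(idx) / n with robustified weights.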
Multiple phenotype association tests based on sliced inverse regression
Background: Joint analysis of multiple phenotypes in studies of biological systems such as genome-wide association studies is critical to revealing the functional interactions between various traits and genetic variants, but the growing dimensionality of the data poses a serious challenge to the widespread use of joint analysis. To handle the large number of variables, we consider the sliced inverse regression (SIR) method. Specifically, we propose a novel SIR-based association test that is robust and powerful in testing the association between multiple predictors and multiple outcomes.
Results: We conduct simulation studies in both low- and high-dimensional settings with various numbers of single-nucleotide polymorphisms and consider the correlation structure of traits. Simulation results show that the proposed method outperforms the existing methods. We also successfully apply our method to the genetic association study of the ADNI dataset. Both the simulation studies and the real data analysis show that the SIR-based association test is valid and achieves higher efficiency than its competitors.
Conclusion: Several scenarios with low- and high-dimensional responses and genotypes are considered in this paper. Our SIR-based method controls the estimated type I error at the pre-specified level α.
Randomized CP tensor decomposition
The CANDECOMP/PARAFAC (CP) tensor decomposition is a popular dimensionality-reduction method for multiway data. Dimensionality reduction is often sought since many high-dimensional tensors have low intrinsic rank relative to the dimension of the ambient measurement space. However, the emergence of 'big data' poses significant computational challenges for computing this fundamental tensor decomposition. By leveraging modern randomized algorithms, we demonstrate that coherent structures can be learned from a smaller representation of the tensor in a fraction of the time. Thus, this simple but powerful algorithm enables one to compute the approximate CP decomposition even for massive tensors. The approximation error can be controlled via oversampling and the computation of power iterations. In addition to theoretical results, several empirical results demonstrate the performance of the proposed algorithm.
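A self-contained numpy sketch of the compress-then-decompose idea follows; it is a generic illustration of randomized compression combined with CP-ALS, using the oversampling and power-iteration controls the abstract mentions, not the authors' exact algorithm.

import numpy as np

def unfold(T, mode):
    # Mode-n unfolding: move the chosen axis to the front, flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(mats):
    # Column-wise Kronecker product, with rows in row-major (C) order.
    out = mats[0]
    for M in mats[1:]:
        out = (out[:, None, :] * M[None, :, :]).reshape(-1, out.shape[1])
    return out

def randomized_basis(A, rank, oversample=10, n_power=2, seed=None):
    # Randomized range finder; oversampling and power iterations are the
    # two error-control knobs mentioned in the abstract.
    rng = np.random.default_rng(seed)
    Y = A @ rng.standard_normal((A.shape[1], rank + oversample))
    for _ in range(n_power):
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)
    return Q                              # A is approximately Q @ Q.T @ A

def cp_als(T, rank, n_iter=50, seed=0):
    # Plain alternating least squares for the CP decomposition.
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((s, rank)) for s in T.shape]
    for _ in range(n_iter):
        for n in range(T.ndim):
            others = [factors[m] for m in range(T.ndim) if m != n]
            G = np.ones((rank, rank))
            for M in others:
                G *= M.T @ M              # Hadamard product of Gram matrices
            factors[n] = unfold(T, n) @ khatri_rao(others) @ np.linalg.pinv(G)
    return factors

def randomized_cp(T, rank):
    # Compress every mode with a randomized basis, run CP-ALS on the small
    # core tensor, then lift the factors back to the original space.
    Qs = [randomized_basis(unfold(T, n), rank) for n in range(T.ndim)]
    core = T
    for n, Q in enumerate(Qs):
        core = np.moveaxis(np.tensordot(Q.T, np.moveaxis(core, n, 0), axes=1), 0, n)
    return [Q @ F for Q, F in zip(Qs, cp_als(core, rank))]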
MPP-based approximated DRM (ADRM) using simplified bivariate approximation with linear regression
Conventional most probable point (MPP)-based dimension reduction methods (DRM) show better accuracy in reliability analysis than first-order reliability methods (FORM), and thus have been successfully applied to reliability-based design optimization (RBDO). However, the MPP-based DRM requires additional function evaluations to improve the accuracy of the probability-of-failure estimate, which can be computationally expensive and thus reduces efficiency. Therefore, in this paper we propose an MPP-based approximated DRM (ADRM) that performs one additional approximation at the MPP to pursue the accuracy of DRM while maintaining the efficiency of FORM. In the proposed method, performance functions are approximated in the original X-space with simplified bivariate DRM and linear regression, using available function information such as gradients obtained during previous MPP searches. Evaluation of quadrature points can therefore be replaced by the proposed approximation. In this manner, we eliminate function evaluations at quadrature points for reliability analysis, so that the proposed method requires function evaluations only for the MPP search, exactly as in FORM. In RBDO, where sequential reliability analyses at different design points are necessary, ADRM becomes more powerful because accumulated function information leads to a more accurate approximation. To further improve the efficiency of the proposed method, several techniques, such as a local window and an adaptive initial point, are proposed as well. A numerical study verifies that the proposed method is as accurate as DRM and as efficient as FORM when utilizing the available function information obtained during MPP searches.
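The ADRM approximation itself is specific to the paper, but the MPP search it shares with FORM is standard; below is a sketch of the HL-RF iteration in standard normal space, with a made-up limit-state function for illustration.

import numpy as np
from scipy.stats import norm

def mpp_search(g, u0, tol=1e-6, max_iter=100, h=1e-6):
    # HL-RF iteration: find the most probable point, i.e. the point on the
    # limit state g(u) = 0 that is closest to the origin in U-space.
    u = np.asarray(u0, dtype=float)
    for _ in range(max_iter):
        grad = np.array([(g(u + h * e) - g(u - h * e)) / (2 * h)
                         for e in np.eye(len(u))])   # central differences
        u_new = (grad @ u - g(u)) / (grad @ grad) * grad
        if np.linalg.norm(u_new - u) < tol:
            u = u_new
            break
        u = u_new
    beta = np.linalg.norm(u)                  # reliability index
    return u, beta, norm.cdf(-beta)           # FORM failure probability

g = lambda u: u[0] ** 2 / 10 + u[1] + 3.0     # hypothetical limit state
u_star, beta, pf = mpp_search(g, u0=[0.1, 0.1])

FORM stops here with Pf approximated by Φ(−β); the DRM family spends extra function evaluations at quadrature points around u_star, which is exactly the cost ADRM aims to remove.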
Prediction analysis for microbiome sequencing data
One goal of human microbiome studies is to relate host traits to microbiome composition. The analysis of microbial community sequencing data presents great statistical challenges, especially when the samples have different library sizes and the data are overdispersed with many zeros. To address these challenges, we introduce a new statistical framework, called predictive analysis in metagenomics via inverse regression (PAMIR), to analyze microbiome sequencing data. Within this framework, an inverse regression model is developed for overdispersed microbiota counts given the trait, and a prediction rule is then constructed by taking advantage of the dimension-reduction structure in the model. An efficient Monte Carlo expectation-maximization algorithm is proposed for maximum likelihood estimation. The method is further generalized to accommodate other types of covariates. We demonstrate the advantages of PAMIR through simulations and two real data examples.
Aggregate Kernel Inverse Regression Estimation
Sufficient dimension reduction (SDR) is a useful tool for nonparametric regression with high-dimensional predictors. Many existing SDR methods rely on assumptions about the distribution of the predictors. Wang et al. proposed an aggregate dimension reduction method to reduce the dependence on such distributional assumptions. Motivated by their work, we propose a novel and effective method that combines the aggregate method with kernel inverse regression estimation. The proposed approach can accurately estimate the dimension reduction directions and substantially improve the exhaustiveness of the estimates under complex models. At the same time, the method does not depend on the arrangement of slices, and the influence of extreme values of the response is reduced. It performs well in numerical examples and a real data application.
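As with the SIR sketch earlier, the kernel inverse regression ingredient can be illustrated by replacing hard slices with a kernel smoother in y; this is a minimal sketch of the classical kernel inverse regression step, not the aggregate procedure itself.

import numpy as np

def kir_directions(X, y, n_dirs=1, bandwidth=0.5):
    # Kernel inverse regression: estimate E[X | Y = y_i] with a Gaussian
    # kernel smoother instead of within-slice averages, then
    # eigendecompose the covariance of the smoothed means.
    n, p = X.shape
    Z = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(np.cov(Z, rowvar=False))
    inv_sqrt = U @ np.diag(1.0 / np.sqrt(s)) @ U.T   # Sigma^(-1/2)
    Z = Z @ inv_sqrt
    W = np.exp(-0.5 * ((y[:, None] - y[None, :]) / bandwidth) ** 2)
    M_hat = (W / W.sum(axis=1, keepdims=True)) @ Z   # row i: E[Z | Y = y_i]
    _, vecs = np.linalg.eigh(M_hat.T @ M_hat / n)
    B = inv_sqrt @ vecs[:, ::-1][:, :n_dirs]
    return B / np.linalg.norm(B, axis=0)

Because there is no discrete slicing, the estimate does not depend on any arrangement of slices, which is the property the abstract highlights.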