Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
7 result(s) for "high-dimensional partial linear model"
The Partial Linear Model in High Dimensions
by Müller, Patric; van de Geer, Sara
in doubly penalized Lasso; high-dimensional partial linear model; Lasso
2015
Partial linear models have been widely used as a flexible method for modelling linear components in conjunction with non-parametric ones. Despite the presence of the non-parametric part, the linear, parametric part can under certain conditions be estimated at a parametric rate. In this paper, we consider a high-dimensional linear part. We show that it can be estimated with oracle rates, using the least absolute shrinkage and selection operator (Lasso) penalty for the linear part and a smoothness penalty for the non-parametric part.
Journal Article
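A minimal sketch of the doubly penalized fit the abstract describes, under the model y = X beta + g(t) + noise: an L1 (Lasso) penalty on the high-dimensional linear part and a ridge-type smoothness penalty on a spline expansion of g. The alternating (backfitting) scheme, the penalty levels, and all variable names below are illustrative assumptions, not the authors' exact estimator.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(0)
n, p = 200, 500
X = rng.standard_normal((n, p))                    # high-dimensional linear covariates
t = rng.uniform(0, 1, n)                           # scalar covariate entering nonparametrically
beta_true = np.zeros(p)
beta_true[:4] = [2.0, -1.5, 1.0, 3.0]              # sparse linear part (hypothetical)
y = X @ beta_true + np.sin(2 * np.pi * t) + 0.5 * rng.standard_normal(n)

B = SplineTransformer(degree=3, n_knots=8).fit_transform(t.reshape(-1, 1))
lasso = Lasso(alpha=0.1)                           # lambda_1: sparsity penalty on beta
ridge = Ridge(alpha=1.0)                           # lambda_2: ridge-type smoothness penalty
g_hat = np.zeros(n)
for _ in range(20):                                # simple backfitting between the two parts
    lasso.fit(X, y - g_hat)                        # Lasso step for the linear part
    ridge.fit(B, y - lasso.predict(X))             # smoothed fit of the nonparametric part
    g_hat = ridge.predict(B)

print("nonzero coefficients in the linear part:", np.sum(lasso.coef_ != 0))
```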
Partially Linear Additive Quantile Regression in Ultra-High Dimension
2016
We consider a flexible semiparametric quantile regression model for analyzing high dimensional heterogeneous data. This model has several appealing features: (1) By considering different conditional quantiles, we may obtain a more complete picture of the conditional distribution of a response variable given high dimensional covariates. (2) The sparsity level is allowed to be different at different quantile levels. (3) The partially linear additive structure accommodates nonlinearity and circumvents the curse of dimensionality. (4) It is naturally robust to heavy-tailed distributions. In this paper, we approximate the nonlinear components using B-spline basis functions. We first study estimation under this model when the nonzero components are known in advance and the number of covariates in the linear part diverges. We then investigate a nonconvex penalized estimator for simultaneous variable selection and estimation. We derive its oracle property for a general class of nonconvex penalty functions in the presence of ultra-high dimensional covariates under relaxed conditions. To tackle the challenges of nonsmooth loss function, nonconvex penalty function and the presence of nonlinear components, we combine a recently developed convex-differencing method with modern empirical process techniques. Monte Carlo simulations and an application to a microarray study demonstrate the effectiveness of the proposed method. We also discuss how the method for a single quantile of interest can be extended to simultaneous variable selection and estimation at multiple quantiles.
Journal Article
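A hedged sketch of the model class in the abstract above: partially linear additive quantile regression with B-spline bases for the nonlinear components. It uses scikit-learn's L1-penalized QuantileRegressor as a convex stand-in for the nonconvex (e.g. SCAD) penalty studied in the paper; the simulated data, quantile level, and tuning values are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(1)
n, p = 300, 400
X = rng.standard_normal((n, p))                    # high-dimensional linear covariates
z = rng.uniform(0, 1, (n, 2))                      # covariates entering additively, nonlinearly
y = (1.5 * X[:, 0] - X[:, 1] + np.sin(2 * np.pi * z[:, 0]) + z[:, 1] ** 2
     + rng.standard_t(df=3, size=n))               # heavy-tailed noise

B = SplineTransformer(degree=3, n_knots=6).fit_transform(z)   # additive B-spline expansion
design = np.hstack([X, B])

tau = 0.75                                         # quantile level of interest
fit = QuantileRegressor(quantile=tau, alpha=0.05, solver="highs").fit(design, y)
beta_hat = fit.coef_[:p]                           # linear part; sparsity from the L1 penalty
print("selected linear covariates:", np.flatnonzero(np.abs(beta_hat) > 1e-8))
```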
Drawing inferences for high-dimensional linear models: A selection-assisted partial regression and smoothing approach
by Banerjee, Moulinath; Fei, Zhe; Zhu, Ji
in Asymptotic properties; Biological effects; biometry
2019
Drawing inferences for high-dimensional models is challenging as regular asymptotic theories are not applicable. This article proposes a new framework for simultaneous estimation and inference in high-dimensional linear models. By smoothing over partial regression estimates based on a given variable selection scheme, we reduce the problem to low-dimensional least squares estimation. The procedure, termed Selection-assisted Partial Regression and Smoothing (SPARES), utilizes data splitting along with variable selection and partial regression. We show that the SPARES estimator is asymptotically unbiased and normal, and derive its variance via a nonparametric delta method. The utility of the procedure is evaluated under various simulation scenarios and via comparisons with the de-biased LASSO estimator, a major competitor. We apply the method to analyze two genomic datasets and obtain biologically meaningful results.
Journal Article
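A rough sketch of the SPARES recipe summarized above: repeatedly split the data, run a variable selection step on one half (here Lasso, an assumed stand-in for a generic selection scheme), estimate the target coefficient by low-dimensional least squares on the other half using the selected variables plus the target variable, and smooth by averaging over splits. The paper's smoothing weights and nonparametric-delta-method variance are not reproduced; sizes and names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n, p, j = 200, 300, 0                              # j: coefficient we want inference for
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[0, 3, 7]] = [1.0, -2.0, 1.5]
y = X @ beta + rng.standard_normal(n)

estimates = []
for b in range(50):                                # number of random splits (the smoothing)
    idx1, idx2 = train_test_split(np.arange(n), test_size=0.5, random_state=b)
    sel = np.flatnonzero(LassoCV(cv=5).fit(X[idx1], y[idx1]).coef_)   # selection half
    keep = np.union1d(sel, [j])                                       # always keep the target
    ols = LinearRegression().fit(X[np.ix_(idx2, keep)], y[idx2])      # estimation half (OLS)
    estimates.append(ols.coef_[np.searchsorted(keep, j)])

print("smoothed estimate of beta_j:", np.mean(estimates))
```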
Property of intrinsic drift coefficients in globally-evolving-based generalized density evolution equation for the first-passage reliability assessment
by Chen, Jianbing; Sun, Tingting; Lyu, Mengze
in Boundary conditions; Civil engineering; Classical and Continuum Physics
2023
The recently developed globally-evolving-based generalized density evolution equation (GE-GDEE) provides a promising tool to obtain the instantaneous probability density of any quantity of interest of stochastically excited high-dimensional systems. By introducing an absorbing boundary process (ABP) determined by failure criteria, a corresponding GE-GDEE can be constructed and solved to assess the first-passage reliability. The construction of the intrinsic drift coefficient of the GE-GDEE is the crucial step. In the present paper, the property of the intrinsic drift coefficients of the GE-GDEE for ABPs is studied. In particular, the intrinsic drift coefficients of some linear systems are constructed exactly via analytical solutions. Compared to the GE-GDEE of the original quantity of interest, the intrinsic drift coefficients of the GE-GDEE of the corresponding ABP in both linear and nonlinear systems change considerably in the vicinity of the boundary, curving towards zero, but vary very little far away from the boundary. Physically, this means that the apparent damping is reduced, leading to an unconservative estimate of the failure probability if the intrinsic drift coefficients of the original quantity of interest rather than those of the ABP are adopted. Interestingly, the failure probability obtained by solving the GE-GDEE of the original quantity of interest is of the same order of magnitude as the true value and can therefore serve as an acceptable approximation, particularly for problems with low failure probabilities. The findings in the paper provide insightful guidance on constructing the intrinsic drift coefficients of the GE-GDEE of ABPs for the first-passage reliability evaluation of high-dimensional nonlinear systems.
Journal Article
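The GE-GDEE construction itself (intrinsic drift coefficients, absorbing boundary process) is not reproduced here. As a crude point of reference for the first-passage reliability problem the abstract addresses, the sketch below estimates the first-passage failure probability of a randomly excited linear oscillator by direct Monte Carlo with an absorbing barrier; all system parameters and the threshold are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
omega, zeta = 2.0 * np.pi, 0.05                    # natural frequency, damping ratio (assumed)
S0, dt, T = 0.01, 0.002, 5.0                       # noise intensity, time step, horizon (assumed)
x_b = 0.05                                         # failure threshold on displacement (assumed)
n_samples, n_steps = 20_000, int(T / dt)

x = np.zeros(n_samples)
v = np.zeros(n_samples)
alive = np.ones(n_samples, dtype=bool)             # trajectories not yet absorbed at the barrier
for _ in range(n_steps):
    # Euler step for  x'' + 2*zeta*omega*x' + omega^2*x = w(t),  w white noise with PSD S0
    w = rng.standard_normal(n_samples) * np.sqrt(2 * np.pi * S0 / dt)
    a = -2 * zeta * omega * v - omega ** 2 * x + w
    x, v = x + v * dt, v + a * dt
    alive &= np.abs(x) < x_b                       # absorb trajectories that cross the threshold

print("estimated first-passage failure probability:", 1 - alive.mean())
```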
Testing covariates in high-dimensional regression
2014
In a high-dimensional linear regression model, we propose a new procedure for testing statistical significance of a subset of regression coefficients. Specifically, we employ the partial covariances between the response variable and the tested covariates to obtain a test statistic. The resulting test is applicable even if the predictor dimension is much larger than the sample size. Under the null hypothesis, together with boundedness and moment conditions on the predictors, we show that the proposed test statistic is asymptotically standard normal, which is further supported by Monte Carlo experiments. A similar test can be extended to generalized linear models. The practical usefulness of the test is illustrated via an empirical example on paid search advertising.
Journal Article
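A simplified illustration of testing a subset of coefficients via partial covariances, in the spirit of the abstract above: the response and the tested covariates are residualized on a small set of controls, and the standardized partial covariances are combined into a chi-square-type statistic. The paper's exact statistic, normalization, and high-dimensional conditions differ; the variable names, the low-dimensional control set, and the combination rule below are assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 500
controls = rng.standard_normal((n, 3))             # low-dimensional control covariates (assumed)
tested = rng.standard_normal((n, 2))               # covariates whose coefficients we test
y = controls @ np.array([1.0, -0.5, 0.3]) + rng.standard_normal(n)   # null hypothesis holds

def residualize(a, z):
    """Residuals of a (vector or matrix columns) after least-squares projection on [1, z]."""
    z1 = np.column_stack([np.ones(len(z)), z])
    return a - z1 @ np.linalg.lstsq(z1, a, rcond=None)[0]

ry = residualize(y, controls)
rt = residualize(tested, controls)
pcov = rt.T @ ry / n                               # partial covariances with the response
se = np.sqrt(np.mean(rt ** 2, axis=0) * np.mean(ry ** 2) / n)
z_scores = pcov / se                               # approximately N(0, 1) under the null
stat = np.sum(z_scores ** 2)                       # chi-square-type combination
p_value = stats.chi2.sf(stat, df=tested.shape[1])
print("test statistic:", round(stat, 3), "p-value:", round(p_value, 3))
```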
Partial least squares Cox regression for genome-wide data
by Nygård, Ståle; Lingjærde, Ole Christian; Borgan, Ørnulf
in Algorithms; Breast Neoplasms - genetics; Computer Simulation
2008
Most methods for survival prediction from high-dimensional genomic data combine the Cox proportional hazards model with some technique of dimension reduction, such as partial least squares regression (PLS). Applying PLS to the Cox model is not entirely straightforward, and multiple approaches have been proposed. The method of Park et al. (Bioinformatics 18(Suppl. 1):S120–S127, 2002) uses a reformulation of the Cox likelihood into a Poisson-type likelihood, thereby enabling estimation by iteratively reweighted partial least squares for generalized linear models. We propose a modification of the method of Park et al. (2002) such that estimates of the baseline hazard and the gene effects are obtained in separate steps. The resulting method has several advantages over the method of Park et al. (2002) and other existing Cox PLS approaches, as it allows for estimation of survival probabilities for new patients, enables a less memory-demanding estimation procedure, and allows for incorporation of lower-dimensional non-genomic variables like disease grade and tumor thickness. We also propose to combine our Cox PLS method with an initial gene selection step in which genes are ordered by their Cox score and only the highest-ranking k% of the genes are retained, obtaining a so-called supervised partial least squares regression method. In simulations, both the unsupervised and the supervised version outperform other Cox PLS methods.
Journal Article
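A hedged sketch of the supervised Cox-PLS pipeline described above: screen genes by a univariate Cox score, keep the top k%, extract PLS components from the retained genes, and fit a Cox model on the components. It leans on lifelines and scikit-learn as convenient stand-ins; the paper's iteratively reweighted Poisson-type PLS and separate baseline-hazard step are not reproduced, and the sample sizes, tuning values, and the censoring-ignoring log-time PLS response are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(5)
n, p, k_pct, n_comp = 120, 200, 10, 3
X = rng.standard_normal((n, p))                    # simulated gene expression matrix
risk = 1.2 * X[:, 0] - 0.8 * X[:, 1]               # two "true" prognostic genes
t_event = rng.exponential(np.exp(-risk))
t_cens = rng.exponential(2.0, n)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(int)

def cox_score(x):
    """Absolute Wald z from a univariate Cox fit, used as the gene-ranking score."""
    df = pd.DataFrame({"x": x, "time": time, "event": event})
    return abs(CoxPHFitter().fit(df, "time", "event").summary.loc["x", "z"])

scores = np.array([cox_score(X[:, j]) for j in range(p)])
keep = np.argsort(scores)[::-1][: p * k_pct // 100]          # retain the top k% of genes

# PLS on the retained genes; log survival time is a crude, censoring-ignoring response here.
pls = PLSRegression(n_components=n_comp).fit(X[:, keep], np.log(time))
components = pls.transform(X[:, keep])
df = pd.DataFrame(components, columns=[f"c{i}" for i in range(n_comp)])
df["time"], df["event"] = time, event
print(CoxPHFitter().fit(df, "time", "event").summary[["coef", "p"]])
```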
Comparison of PLS algorithms when number of objects is much larger than number of variables
2009
NIPALS and SIMPLS are the most commonly used algorithms for partial least squares analysis. When the number of objects, N, is much larger than the number of explanatory variables, K, and/or response variables, M, the NIPALS algorithm can be time consuming. Even though SIMPLS is not as time consuming as NIPALS and can be preferred over it, there are kernel algorithms developed especially for the cases where N is much larger than the number of variables. In this study, the NIPALS, SIMPLS and some kernel algorithms have been used to build partial least squares regression models. Their performances have been compared in terms of the total CPU time spent on the calculation of latent variables, leave-one-out cross validation and bootstrap methods. According to the numerical results, one of the kernel algorithms suggested by Dayal and MacGregor (J Chemom 11:73–85, 1997) is the fastest algorithm.
Journal Article
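A sketch contrasting NIPALS-based PLS (scikit-learn's PLSRegression) with a kernel-style PLS1 that works only from the cross-products X'X and X'y, which is the idea the kernel algorithms above exploit when N is much larger than K. The single-response routine below is a simplified illustration in the spirit of Dayal and MacGregor's kernel algorithm, not a verified reimplementation of it; the problem sizes and the timing comparison are illustrative.

```python
import time
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def kernel_pls1(XtX, Xty, n_components):
    """PLS1 regression coefficients computed only from X'X (K x K) and X'y (K,)."""
    K = XtX.shape[0]
    R = np.zeros((K, n_components))                # weights mapping original X to scores
    P = np.zeros((K, n_components))                # X-loadings
    q = np.zeros(n_components)                     # y-loadings
    S = Xty.copy()                                 # deflated X'y
    for a in range(n_components):
        w = S / np.linalg.norm(S)
        r = w - R[:, :a] @ (P[:, :a].T @ w)
        tt = float(r @ XtX @ r)                    # t't for the score t = X r
        P[:, a] = XtX @ r / tt
        q[a] = float(S @ r) / tt
        R[:, a] = r
        S = S - P[:, a] * (q[a] * tt)              # deflate the cross-product, not X itself
    return R @ q                                   # regression coefficients for centered data

rng = np.random.default_rng(6)
N, K, A = 200_000, 50, 5                           # many objects, few variables
X = rng.standard_normal((N, K))
y = X @ rng.standard_normal(K) + rng.standard_normal(N)
Xc, yc = X - X.mean(0), y - y.mean()               # PLS assumes centered data

t0 = time.perf_counter()
b_nipals = PLSRegression(n_components=A, scale=False).fit(Xc, yc).coef_.ravel()
t1 = time.perf_counter()
b_kernel = kernel_pls1(Xc.T @ Xc, Xc.T @ yc, A)    # cross-products computed once, O(N K^2)
t2 = time.perf_counter()

print(f"NIPALS: {t1 - t0:.3f}s  kernel: {t2 - t1:.3f}s  "
      f"max coefficient difference: {np.max(np.abs(b_nipals - b_kernel)):.2e}")
```

For a single response the two routes should agree to numerical precision; the point of the kernel formulation is that, once the K x K and K x 1 cross-products are formed, the per-component work no longer depends on N.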