Catalogue Search | MBRL
Explore the vast range of titles available.
171 result(s) for "Penalized Least Square"
CALIBRATING NONCONVEX PENALIZED REGRESSION IN ULTRA-HIGH DIMENSION
2013
We investigate high-dimensional nonconvex penalized regression, where the number of covariates may grow at an exponential rate. Although recent asymptotic theory has established that there exists a local minimum possessing the oracle property under general conditions, it is still largely an open problem how to identify the oracle estimator among potentially multiple local minima. There are two main obstacles: (1) due to the presence of multiple minima, the solution path is nonunique and is not guaranteed to contain the oracle estimator; (2) even if a solution path is known to contain the oracle estimator, the optimal tuning parameter depends on many unknown factors and is hard to estimate. To address these two challenging issues, we first prove that an easy-to-calculate calibrated CCCP algorithm produces a consistent solution path which contains the oracle estimator with probability approaching one. Furthermore, we propose a high-dimensional BIC criterion and show that it can be applied to the solution path to select the optimal tuning parameter which asymptotically identifies the oracle estimator. The theory is established for a general class of nonconvex penalties in the ultra-high dimensional setup when the random errors follow a sub-Gaussian distribution. Monte Carlo studies confirm that the calibrated CCCP algorithm combined with the proposed high-dimensional BIC has desirable performance in identifying the underlying sparsity pattern for high-dimensional data analysis.
Journal Article
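The abstract's core computational idea, replacing the nonconvex penalty with a convex surrogate at each step, can be sketched briefly. The Python below is a minimal illustration, not the authors' implementation: each CCCP step solves a weighted lasso whose weights come from the SCAD derivative at the current iterate. The function names and the constant a = 3.7 are standard SCAD conventions rather than details from the paper, columns of X are assumed standardized, and the paper's high-dimensional BIC calibration of the tuning parameter is omitted.

```python
import numpy as np

def scad_deriv(t, lam, a=3.7):
    """Derivative of the SCAD penalty at |t|, used as per-coefficient
    lasso weights in the next CCCP surrogate problem."""
    t = np.abs(t)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def weighted_lasso_cd(X, y, w, n_sweeps=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + sum_j w_j |b_j|."""
    n, p = X.shape
    b = np.zeros(p)
    ss = (X ** 2).sum(axis=0) / n          # per-column curvature
    r = y.copy()                           # running residual y - Xb
    for _ in range(n_sweeps):
        for j in range(p):
            rho = X[:, j] @ r / n + ss[j] * b[j]
            bj = np.sign(rho) * max(abs(rho) - w[j], 0.0) / ss[j]
            r += X[:, j] * (b[j] - bj)     # keep residual in sync
            b[j] = bj
    return b

def cccp_scad(X, y, lam, n_outer=5):
    """Start from a plain lasso fit, then solve a sequence of reweighted
    lasso problems (the convex CCCP surrogates of the SCAD objective)."""
    b = weighted_lasso_cd(X, y, np.full(X.shape[1], lam))
    for _ in range(n_outer):
        b = weighted_lasso_cd(X, y, scad_deriv(b, lam))
    return b
```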
IDENTIFYING LATENT STRUCTURES IN PANEL DATA
by Phillips, Peter C. B.; Shi, Zhentao; Su, Liangjun
in Classification; Classifiers; cluster analysis
2016
This paper provides a novel mechanism for identifying and estimating latent group structures in panel data using penalized techniques. We consider both linear and nonlinear models where the regression coefficients are heterogeneous across groups but homogeneous within a group and the group membership is unknown. Two approaches are considered—penalized profile likelihood (PPL) estimation for the general nonlinear models without endogenous regressors, and penalized GMM (PGMM) estimation for linear models with endogeneity. In both cases, we develop a new variant of Lasso called classifier-Lasso (C-Lasso) that serves to shrink individual coefficients to the unknown group-specific coefficients. C-Lasso achieves simultaneous classification and consistent estimation in a single step and the classification exhibits the desirable property of uniform consistency. For PPL estimation, C-Lasso also achieves the oracle property so that group-specific parameter estimators are asymptotically equivalent to infeasible estimators that use individual group identity information. For PGMM estimation, the oracle property of C-Lasso is preserved in some special cases. Simulations demonstrate good finite-sample performance of the approach in both classification and estimation. Empirical applications to both linear and nonlinear models are presented.
Journal Article
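As a rough illustration of the C-Lasso idea in the abstract, the sketch below (hypothetical names, not the authors' estimation code) shows the multiplicative classifier penalty, which vanishes exactly when a unit's coefficient vector coincides with one of the K group centers, along with the post-estimation classification rule.

```python
import numpy as np

def c_lasso_penalty(beta_i, alphas, lam):
    """Classifier-Lasso penalty for one cross-sectional unit:
    lam * prod_k ||beta_i - alpha_k||. Minimization therefore shrinks
    each unit's coefficients onto one of the group-level vectors."""
    return lam * np.prod([np.linalg.norm(beta_i - a) for a in alphas])

def classify(betas, alphas):
    """Assign each unit (rows of betas) to its nearest group center."""
    d = np.linalg.norm(betas[:, None, :] - alphas[None, :, :], axis=2)
    return d.argmin(axis=1)
```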
A SELECTIVE OVERVIEW OF VARIABLE SELECTION IN HIGH DIMENSIONAL FEATURE SPACE
2010
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection. Questions of what limits of dimensionality such methods can handle, what role the penalty functions play, and what statistical properties the resulting estimators enjoy are rapidly driving advances in the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.
Journal Article
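The independence screening this review emphasizes is simple to illustrate. The sketch below is illustrative only (it assumes columns of X have nonzero variance): it ranks covariates by absolute marginal correlation with the response and keeps the top d, after which a penalized-likelihood method can be run on the reduced set, the two-scale strategy the abstract mentions.

```python
import numpy as np

def sis(X, y, d):
    """Sure independence screening: keep the d covariates with the
    largest absolute marginal correlation with the response."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    score = np.abs(Xs.T @ ys) / len(y)      # marginal correlations
    return np.argsort(score)[::-1][:d]      # indices of retained covariates
```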
An Automatic Baseline Correction Method Based on the Penalized Least Squares Method
by Tang, Xiaojun; Tong, Angxin; Wang, Jingwei
in automated baseline correction; Decomposition; infrared spectra
2020
When spectra with baseline drift are used for quantitative and qualitative analysis, the results can easily be inaccurate or even wrong. Although there are several baseline correction methods based on penalized least squares, they all have one or more parameters that must be optimized by users. For this purpose, an automatic baseline correction method based on penalized least squares is proposed in this paper. The algorithm first linearly extends the ends of the spectrum signal, and a Gaussian peak is added to the extended range. Then, the whole spectrum is corrected by the adaptive smoothness parameter penalized least squares (asPLS) method: by tuning the smoothing parameter λ of asPLS to obtain a different root-mean-square error (RMSE) in the extended range, the optimal λ is selected as the one with minimal RMSE. Finally, the baseline of the original signal is well estimated by asPLS with the optimal λ. The paper concludes with experimental results on simulated spectra and measured infrared spectra, demonstrating that the proposed method can automatically deal with different types of baseline drift.
Journal Article
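The penalized least squares core shared by this family of baseline methods is a Whittaker-type smoother, and the abstract's λ selection is a grid search on the artificially extended region. The sketch below renders both minimally; asPLS's adaptive weight update is not reproduced, and all names are illustrative.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def pls_baseline(y, lam, w):
    """Penalized least squares (Whittaker) baseline: minimize
    sum_i w_i (y_i - z_i)^2 + lam * sum (second differences of z)^2,
    i.e. solve the sparse system (W + lam * D'D) z = W y."""
    m = len(y)
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(m - 2, m))
    W = sparse.diags(w)
    return spsolve((W + lam * (D.T @ D)).tocsc(), w * y)

def pick_lam(y_ext, w, ext_idx, lams):
    """Choose lam with minimal RMSE between the fitted baseline and the
    known linear extension (ext_idx marks the appended samples)."""
    rmse = lambda lam: np.sqrt(np.mean(
        (pls_baseline(y_ext, lam, w)[ext_idx] - y_ext[ext_idx]) ** 2))
    return min(lams, key=rmse)
```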
Sparse Reduced-Rank Regression for Simultaneous Dimension Reduction and Variable Selection
2012
Reduced-rank regression is an effective method for predicting multiple response variables from the same set of predictor variables. It reduces the number of model parameters and takes advantage of interrelations between the response variables, and hence improves predictive accuracy. We propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty. We apply a group-lasso type penalty that treats each row of the matrix of regression coefficients as a group and show that this penalty satisfies certain desirable invariance properties. We develop two numerical algorithms to solve the penalized regression problem and establish the asymptotic consistency of the proposed method. In particular, the manifold structure of the reduced-rank regression coefficient matrix is considered and studied in our theoretical analysis. In our simulation study and real data analysis, the new method is compared with several existing variable selection methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
Journal Article
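The row-wise group-lasso penalty described in the abstract has a simple computational primitive: the group soft-thresholding (proximal) step below, sketched for illustration rather than as the authors' full algorithm. Shrinking each predictor's row of coefficients as a group is what removes a variable from all responses at once.

```python
import numpy as np

def row_soft_threshold(B, tau):
    """Proximal step for the penalty tau * sum_j ||B[j, :]||_2 on the
    coefficient matrix B: each row is shrunk toward zero as a group,
    and predictors whose row vanishes are dropped entirely."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    return B * np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
```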
Spatial Homogeneity Pursuit of Regression Coefficients for Large Datasets
2019
Spatial regression models have been widely used to describe the relationship between a response variable and some explanatory variables over a region of interest, taking into account the spatial dependence of the observations. In many applications, relationships between response variables and covariates are expected to exhibit complex spatial patterns. We propose a new approach, referred to as spatially clustered coefficient (SCC) regression, to detect spatially clustered patterns in the regression coefficients. It incorporates spatial neighborhood information through a carefully constructed regularization to automatically detect change points in space and to achieve computational scalability. Our numerical studies suggest that SCC works very effectively, capturing not only clustered coefficients, but also smoothly varying coefficients because of its strong local adaptivity. This flexibility allows researchers to explore various spatial structures in regression coefficients. We also establish theoretical properties of SCC. We use SCC to explore the relationship between the temperature and salinity of sea water in the Atlantic basin; this can provide important insights about the evolution of individual water masses and the pathway and strength of meridional overturning circulation in oceanography.
Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
Journal Article
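A minimal sketch of an SCC-style criterion, assuming one coefficient vector per location and a known spatial neighbor graph: a local least-squares term plus a fusion penalty over neighbor pairs that pulls nearby coefficients together into clusters. The setup and names are illustrative; the paper's actual regularization construction and scalable solver are not reproduced.

```python
import numpy as np

def scc_objective(betas, X, y, edges, lam):
    """Fit term with one coefficient vector per location plus a fusion
    penalty over spatial neighbor pairs.
    betas: (n_loc, p); X: (n_loc, p); y: (n_loc,); edges: [(i, j), ...]."""
    fit = np.sum((y - np.einsum("ip,ip->i", X, betas)) ** 2)
    fuse = sum(np.linalg.norm(betas[i] - betas[j]) for i, j in edges)
    return fit + lam * fuse
```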
ESTIMATION OF HIGH-DIMENSIONAL LOW-RANK MATRICES
2011
Suppose that we observe entries or, more generally, linear combinations of entries of an unknown m × T matrix A corrupted by noise. We are particularly interested in the high-dimensional setting where the number mT of unknown entries can be much larger than the sample size N. Motivated by several applications, we consider estimation of matrix A under the assumption that it has small rank. This can be viewed as a dimension reduction or sparsity assumption. In order to shrink toward a low-rank representation, we investigate penalized least squares estimators with a Schatten-p quasinorm penalty term, p ≤ 1. We study these estimators under two possible assumptions: a modified version of the restricted isometry condition and a uniform bound on the ratio "empirical norm induced by the sampling operator / Frobenius norm." The main results are stated as nonasymptotic upper bounds on the prediction risk and on the Schatten-q risk of the estimators, where q ∈ [p, 2]. The rates that we obtain for the prediction risk are of the form rm/N (for m = T), up to logarithmic factors, where r is the rank of A. The particular examples of multi-task learning and matrix completion are worked out in detail. The proofs are based on tools from the theory of empirical processes. As a by-product, we derive bounds for the kth entropy numbers of the quasi-convex Schatten class embeddings $S_p^M \hookrightarrow S_2^M$, p < 1, which are of independent interest.
Journal Article
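For the p = 1 member of the Schatten family (the nuclear norm), the shrinkage toward low rank that the abstract describes has a closed-form proximal step: singular value soft-thresholding. The sketch below shows only this p = 1 case and is illustrative; it says nothing about the quasinorm case p < 1 studied in the paper.

```python
import numpy as np

def svt(A, tau):
    """Singular value soft-thresholding: the proximal operator of the
    nuclear norm tau * ||A||_* , which shrinks the singular values and
    thereby pulls the estimate toward a low-rank representation."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt
```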
Scaled sparse linear regression
2012
Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual square and scaling the penalty in proportion to the estimated noise level. The iterative algorithm costs little beyond the computation of a path or grid of the sparse regression estimator for penalty levels above a proper threshold. For the scaled lasso, the algorithm is a gradient descent in a convex minimization of a penalized joint loss function for the regression coefficients and noise level. Under mild regularity conditions, we prove that the scaled lasso simultaneously yields an estimator for the noise level and an estimated coefficient vector satisfying certain oracle inequalities for prediction, the estimation of the noise level and the regression coefficients. These inequalities provide sufficient conditions for the consistency and asymptotic normality of the noise-level estimator, including certain cases where the number of variables is of greater order than the sample size. Parallel results are provided for least-squares estimation after model selection by the scaled lasso. Numerical results demonstrate the superior performance of the proposed methods over an earlier proposal of joint convex minimization.
Journal Article
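The iterative algorithm the abstract describes alternates a lasso fit, with penalty scaled by the current noise estimate, against a noise update from the mean residual square. A minimal sketch, using scikit-learn's lasso parameterization as a stand-in for the paper's formulation; the default lam0 = sqrt(2 log p / n) mentioned in the comment is a common convention, not a prescription from the paper.

```python
import numpy as np
from sklearn.linear_model import Lasso

def scaled_lasso(X, y, lam0, n_iter=50, tol=1e-8):
    """Scaled lasso sketch: alternate (1) lasso with penalty sigma * lam0
    and (2) noise level from the mean residual square, to equilibrium.
    A common choice is lam0 = sqrt(2 * log(p) / n)."""
    sigma = np.std(y)                       # crude initial noise level
    for _ in range(n_iter):
        fit = Lasso(alpha=sigma * lam0).fit(X, y)
        sigma_new = np.sqrt(np.mean((y - fit.predict(X)) ** 2))
        if abs(sigma_new - sigma) < tol:    # joint minimization has settled
            break
        sigma = sigma_new
    return fit.coef_, sigma_new
```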
Optimal Penalized Function-on-Function Regression Under a Reproducing Kernel Hilbert Space Framework
2018
Many scientific studies collect data where the response and predictor variables are both functions of time, location, or some other covariate. Understanding the relationship between these functional variables is a common goal in these studies. Motivated by two real-life examples, we present in this article a function-on-function regression model that can be used to analyze this kind of functional data. Our estimator of the 2D coefficient function is the optimizer of a form of penalized least squares where the penalty enforces a certain level of smoothness on the estimator. Our first result is the representer theorem, which states that the exact optimizer of the penalized least squares actually resides in a data-adaptive finite-dimensional subspace, although the optimization problem is defined on a function space of infinite dimensions. This theorem then allows easy incorporation of Gaussian quadrature into the optimization of the penalized least squares, which can be carried out through standard numerical procedures. We also show that our estimator achieves the minimax convergence rate in mean prediction under the framework of function-on-function regression. Extensive simulation studies demonstrate the numerical advantages of our method over existing ones, where a sparse functional data extension is also introduced. The proposed method is then applied to our motivating examples of the benchmark Canadian weather data and a histone regulation study. Supplementary materials for this article are available online.
Journal Article
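The representer theorem in the abstract reduces an infinite-dimensional optimization to a finite linear system of the generic penalized least squares form sketched below. How the basis matrix Z and the roughness-penalty matrix P are built from the reproducing kernel and the quadrature rule is paper-specific and omitted here.

```python
import numpy as np

def penalized_ls(Z, y, P, lam):
    """Finite-dimensional penalized least squares:
    minimize ||y - Z c||^2 + lam * c' P c over coefficients c, solved
    via the normal equations (Z'Z + lam * P) c = Z'y."""
    return np.linalg.solve(Z.T @ Z + lam * P, Z.T @ y)
```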
REGULARIZATION AFTER RETENTION IN ULTRAHIGH DIMENSIONAL LINEAR REGRESSION MODELS
2019
In the ultrahigh dimensional setting, independence screening has been shown, both theoretically and empirically, to be a useful variable selection framework with low computation cost. In this work, we propose a two-step framework that uses marginal information in a different fashion than independence screening. In particular, we retain significant variables rather than screening out irrelevant ones. The method is shown to be model selection consistent in the ultrahigh dimensional linear regression model. To improve the finite sample performance, we then introduce a three-step version and characterize its asymptotic behavior. Simulations and data analysis show advantages of our method over independence screening and its iterative variants in certain regimes.
Journal Article
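One plausible reading of the retain-then-regularize idea, sketched with hypothetical choices: retention by absolute marginal correlation, residualization so the retained variables stay unpenalized, and a lasso on the remainder. The paper's actual retention rule and regularization are not reproduced here; all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def retention_lasso(X, y, n_retain, alpha):
    """Two-step sketch: retain n_retain covariates with the largest
    absolute marginal correlations, project them out so they remain
    unpenalized, then lasso the rest and back out retained coefficients."""
    corr = np.abs((X - X.mean(axis=0)).T @ (y - y.mean()))
    keep = np.argsort(corr)[::-1][:n_retain]
    rest = np.setdiff1d(np.arange(X.shape[1]), keep)
    Q, _ = np.linalg.qr(X[:, keep])            # orthobasis of retained set
    proj = lambda M: M - Q @ (Q.T @ M)         # project out retained columns
    fit = Lasso(alpha=alpha).fit(proj(X[:, rest]), proj(y))
    b = np.zeros(X.shape[1])
    b[rest] = fit.coef_
    coef, *_ = np.linalg.lstsq(X[:, keep], y - X[:, rest] @ b[rest],
                               rcond=None)     # refit retained set by OLS
    b[keep] = coef
    return b
```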