Catalogue Search | MBRL
Search Results Heading
Explore the vast range of titles available.
MBRLSearchResults
-
DisciplineDiscipline
-
Is Peer ReviewedIs Peer Reviewed
-
Item TypeItem Type
-
SubjectSubject
-
YearFrom:-To:
-
More FiltersMore FiltersSourceLanguage
Done
Filters
Reset
36
result(s) for
"coordinate gradient descent"
Sort by:
GLMMLasso: An Algorithm for High-Dimensional Generalized Linear Mixed Models Using ℓ1-Penalization
by
Schelldorfer, Jürg
,
Meier, Lukas
,
Bühlmann, Peter
in
Algorithms
,
Approximation
,
Coordinate gradient descent
2014
We propose an ℓ
1
-penalized algorithm for fitting high-dimensional generalized linear mixed models (GLMMs). GLMMs can be viewed as an extension of generalized linear models for clustered observations. Our Lasso-type approach for GLMMs should be mainly used as variable screening method to reduce the number of variables below the sample size. We then suggest a refitting by maximum likelihood based on the selected variables only. This is an effective correction to overcome problems stemming from the variable screening procedure that are more severe with GLMMs than for generalized linear models. We illustrate the performance of our algorithm on simulated as well as on real data examples. Supplementary materials are available online and the algorithm is implemented in the R package
glmmixedlasso
.
Journal Article
An Accelerated Coordinate Gradient Descent Algorithm for Non-separable Composite Optimization
2022
Coordinate descent algorithms are popular in machine learning and large-scale data analysis problems due to their low computational cost iterative schemes and their improved performances. In this work, we define a monotone accelerated coordinate gradient descent-type method for problems consisting of minimizing f+g, where f is quadratic and g is nonsmooth and non-separable and has a low-complexity proximal mapping. The algorithm is enabled by employing the forward–backward envelope, a composite envelope that possess an exact smooth reformulation of f+g. We prove the algorithm achieves a convergence rate of O(1/k1.5) in terms of the original objective function, improving current coordinate descent-type algorithms. In addition, we describe an adaptive variant of the algorithm that backtracks the spectral information and coordinate Lipschitz constants of the problem. We numerically examine our algorithms on various settings, including two-dimensional total-variation-based image inpainting problems, showing a clear advantage in performance over current coordinate descent-type methods.
Journal Article
Estimation for High-Dimensional Linear Mixed-Effects Models Using ℓ1-Penalization
by
BÜHLMANN, PETER
,
SCHELLDORFER, JÜRG
,
DE GEER, SARA VAN
in
adaptive Lasso
,
Calculus of variations and optimal control
,
Coefficients
2011
We propose an ℓ 1 -penalized estimation procedure for high-dimensional linear mixedeffects models. The models are useful whenever there is a grouping structure among highdimensional observations, that is, for clustered data. We prove a consistency and an oracle optimality result and we develop an algorithm with provable numerical convergence. Furthermore, we demonstrate the performance of the method on simulated and a real high-dimensional data set.
Journal Article
Markov chain block coordinate descent
2020
The method of block coordinate gradient descent (BCD) has been a powerful method for large-scale optimization. This paper considers the BCD method that successively updates a series of blocks selected according to a Markov chain. This kind of block selection is neither i.i.d. random nor cyclic. On the other hand, it is a natural choice for some applications in distributed optimization and Markov decision process, where i.i.d. random and cyclic selections are either infeasible or very expensive. By applying mixing-time properties of a Markov chain, we prove convergence of Markov chain BCD for minimizing Lipschitz differentiable functions, which can be nonconvex. When the functions are convex and strongly convex, we establish both sublinear and linear convergence rates, respectively. We also present a method of Markov chain inertial BCD. Finally, we discuss potential applications.
Journal Article
UAV-Assisted Cooperative NOMA and OFDM Communication Systems: Analysis and Optimization
2024
Utilizing unmanned aerial vehicles (UAVs) to facilitate wireless communication has emerged as a viable and promising strategy to enhance current and prospective wireless systems. This approach offers many advantages by establishing line-of-sight connections, optimizing operational efficiency, and enabling flexible deployment capabilities in various terrains. Thus, in this paper, we investigate UAV communication in a relaying network in which a UAV helps communication between a source and two destination users while flying to a location. To have a complete view of our proposed system, we consider both orthogonal multiple access, such as OFDMs and non-orthogonal multiple access (NOMA) scenarios. Moreover, we apply successive convex optimization (SCO) and the block-coordinate gradient descent (BCGD) for the sum-rate optimization problems to improve the system performance under constraints of total bandwidth and total power at the ground base station and UAV. The experimental results validate that the achievable secrecy rates are notably enhanced using our proposed algorithms and show optimal trends for critical parameters, such as transmit powers, the flight trajectory and speed of the UAV, and resource allocation of OFDM and NOMA.
Journal Article
Structured feature selection using coordinate descent optimization
2016
Background
Existing feature selection methods typically do not consider prior knowledge in the form of structural relationships among features. In this study, the features are structured based on prior knowledge into groups. The problem addressed in this article is how to select one
representative
feature from each group such that the selected features are
jointly
discriminating the classes.
The problem is formulated as a binary constrained optimization and the combinatorial optimization is relaxed as a convex-concave problem, which is then transformed into a sequence of convex optimization problems so that the problem can be solved by any standard optimization algorithm. Moreover, a block coordinate gradient descent optimization algorithm is proposed for high dimensional feature selection, which in our experiments was four times faster than using a standard optimization algorithm.
Results
In order to test the effectiveness of the proposed formulation, we used microarray analysis as a case study, where genes with similar expressions or similar molecular functions were grouped together. In particular, the proposed block coordinate gradient descent feature selection method is evaluated on five benchmark microarray gene expression datasets and evidence is provided that the proposed method gives more accurate results than the state-of-the-art gene selection methods. Out of 25 experiments, the proposed method achieved the highest average AUC in 13 experiments while the other methods achieved higher average AUC in no more than 6 experiments.
Conclusion
A method is developed to select a feature from each group. When the features are grouped based on similarity in gene expression, we showed that the proposed algorithm is more accurate than state-of-the-art gene selection methods that are particularly developed to select highly discriminative and less redundant genes. In addition, the proposed method can exploit any grouping structure among features, while alternative methods are restricted to using similarity based grouping.
Journal Article
A coordinate gradient descent method for ℓ1-regularized convex minimization
2011
In applications such as signal processing and statistics, many problems involve finding sparse solutions to under-determined linear systems of equations. These problems can be formulated as a structured nonsmooth optimization problems, i.e., the problem of minimizing
ℓ
1
-regularized linear least squares problems. In this paper, we propose a block coordinate gradient descent method (abbreviated as CGD) to solve the more general
ℓ
1
-regularized convex minimization problems, i.e., the problem of minimizing an
ℓ
1
-regularized convex smooth function. We establish a Q-linear convergence rate for our method when the coordinate block is chosen by a Gauss-Southwell-type rule to ensure sufficient descent. We propose efficient implementations of the CGD method and report numerical results for solving large-scale
ℓ
1
-regularized linear least squares problems arising in compressed sensing and image deconvolution as well as large-scale
ℓ
1
-regularized logistic regression problems for feature selection in data classification. Comparison with several state-of-the-art algorithms specifically designed for solving large-scale
ℓ
1
-regularized linear least squares or logistic regression problems suggests that an efficiently implemented CGD method may outperform these algorithms despite the fact that the CGD method is not specifically designed just to solve these special classes of problems.
Journal Article
Block-Coordinate Gradient Descent Method for Linearly Constrained Nonsmooth Separable Optimization
by
Yun, S.
,
Tseng, P.
in
Applications of Mathematics
,
Applied sciences
,
Calculus of Variations and Optimal Control; Optimization
2009
We consider the problem of minimizing the weighted sum of a smooth function
f
and a convex function
P
of
n
real variables subject to
m
linear equality constraints. We propose a block-coordinate gradient descent method for solving this problem, with the coordinate block chosen by a Gauss-Southwell-
q
rule based on sufficient predicted descent. We establish global convergence to first-order stationarity for this method and, under a local error bound assumption, linear rate of convergence. If
f
is convex with Lipschitz continuous gradient, then the method terminates in
O
(
n
2
/
ε
) iterations with an
ε
-optimal solution. If
P
is separable, then the Gauss-Southwell-
q
rule is implementable in
O
(
n
) operations when
m
=1 and in
O
(
n
2
) operations when
m
>1. In the special case of support vector machines training, for which
f
is convex quadratic,
P
is separable, and
m
=1, this complexity bound is comparable to the best known bound for decomposition methods. If
f
is convex, then, by gradually reducing the weight on
P
to zero, the method can be adapted to solve the bilevel problem of minimizing
P
over the set of minima of
f
+
δ
X
, where
X
denotes the closure of the feasible set. This has application in the least 1-norm solution of maximum-likelihood estimation.
Journal Article
On the Iteration Complexity of Cyclic Coordinate Gradient Descent Methods
2014
The cyclic (block) coordinate gradient descent method is an optimization method that has attracted much interest in applications of applied mathematics, statistics, and engineering. Reasons for this include its simplicity, speed, and stability, as well as its competitive performance on separable nonsmooth convex minimization problems, whose objective function is the sum of a smooth function and a separable (and possibly nonsmooth) convex function, such as the $\\ell_1$-regularized linear least squares problem and the $\\ell_1$-regularized logistic regression problem. But very little is known about the worst case iteration complexity of the method for solving the separable nonsmooth convex minimization problem. We prove that the method terminates in $O(1/\\epsilon)$ iterations with an $\\epsilon$-optimal solution, or equivalently, the convergence rate for it is $O(1/k)$, where $k$ is the iteration counter, when the smooth function of the objective has a Lipschitz gradient. Also, the linear rate convergence of the method is proved when the objective is a strongly convex function having a Lipschitz gradient or when the smooth function of the objective is a composition of a strong convex function having a Lipschitz gradient with a linear function, the convex function of the objective is polyhedral, and there is a real number whose corresponding level set of the convex function contains the set of optimal solutions and is bounded.
Journal Article
An Alternating Trust Region Algorithm for Distributed Linearly Constrained Nonlinear Programs, Application to the Optimal Power Flow Problem
by
Jones, Colin N.
,
Hours, Jean-Hubert
in
Algorithms
,
Applications of Mathematics
,
Calculus of Variations and Optimal Control; Optimization
2017
A novel trust region method for solving linearly constrained nonlinear programs is presented. The proposed technique is amenable to a distributed implementation, as its salient ingredient is an alternating projected gradient sweep in place of the Cauchy point computation. It is proven that the algorithm yields a sequence that globally converges to a critical point. As a result of some changes to the standard trust region method, namely a proximal regularisation of the trust region subproblem, it is shown that the local convergence rate is linear with an arbitrarily small ratio. Thus, convergence is locally almost superlinear, under standard regularity assumptions. The proposed method is successfully applied to compute local solutions to alternating current optimal power flow problems in transmission and distribution networks. Moreover, the new mechanism for computing a Cauchy point compares favourably against the standard projected search, as for its activity detection properties.
Journal Article