Catalogue Search | MBRL
Explore the vast range of titles available.
134,380 result(s) for "Optimization and Control"
Minimizing finite sums with the stochastic average gradient
by Le Roux, Nicolas; Bach, Francis; Schmidt, Mark
in Algorithms, Calculus of Variations and Optimal Control; Optimization, Combinatorics
2017
We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in the sum. However, by incorporating a memory of previous gradient values, the SAG method achieves a faster convergence rate than black-box SG methods. The convergence rate is improved from O(1/√k) to O(1/k) in general, and when the sum is strongly convex, the convergence rate is improved from the sub-linear O(1/k) to a linear convergence rate of the form O(ρ^k) for ρ < 1. Further, in many cases the convergence rate of the new method is also faster than black-box deterministic gradient methods, in terms of the number of gradient evaluations. This extends our earlier work, Le Roux et al. (Adv Neural Inf Process Syst, 2012), which only led to a faster rate for well-conditioned strongly convex problems. Numerical experiments indicate that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.
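The memory-based update described above — replace one stored gradient per iteration and step along the running average — can be sketched as follows. This is a minimal illustration on a toy least-squares finite sum; the problem data, step size, and iteration count are arbitrary choices for demonstration, not values from the paper:

```python
import numpy as np

def sag(grad_i, n, x0, step, iters, seed=0):
    """Minimal SAG sketch: store the most recent gradient of each term
    and step along the running average of the stored gradients."""
    x = x0.copy()
    table = np.zeros((n, x0.size))    # last gradient seen for each term
    avg = np.zeros_like(x0)           # average of the rows of `table`
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        i = rng.integers(n)
        g = grad_i(i, x)
        avg += (g - table[i]) / n     # refresh the average in O(d)
        table[i] = g
        x = x - step * avg
    return x

# Toy finite sum: f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)**2
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 3))
b = rng.standard_normal(50)
grad_i = lambda i, x: (A[i] @ x - b[i]) * A[i]
x = sag(grad_i, n=50, x0=np.zeros(3), step=0.005, iters=20000)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]  # exact least-squares solution
```

Note the key trick: the average is maintained incrementally, so each iteration touches only one gradient, yet the step uses information from all terms.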
Journal Article
Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods
2013
In view of the minimization of a nonsmooth nonconvex function f, we prove an abstract convergence result for descent methods satisfying a sufficient-decrease assumption and allowing a relative error tolerance. Our result guarantees the convergence of bounded sequences, under the assumption that the function f satisfies the Kurdyka–Łojasiewicz inequality. This assumption allows us to cover a wide range of problems, including nonsmooth semi-algebraic (or, more generally, tame) minimization. The specialization of our result to different kinds of structured problems provides several new convergence results for inexact versions of the gradient method, the proximal method, the forward–backward splitting algorithm, gradient projection, and some proximal regularization of the Gauss–Seidel method in a nonconvex setting. Our results are illustrated through feasibility problems and iterative thresholding procedures for compressive sensing.
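Among the schemes covered, forward–backward splitting is easy to state concretely. Below is a minimal sketch for the (convex) ℓ1-regularized least-squares case, where the backward step is the soft-thresholding proximal operator; the data and parameter values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward(A, b, lam, step, iters):
    """Forward-backward splitting for 0.5*||Ax - b||^2 + lam*||x||_1:
    a forward (gradient) step on the smooth term, then a backward
    (proximal) step on the nonsmooth term."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)          # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
obj = lambda v: 0.5 * np.linalg.norm(A @ v - b) ** 2 + 0.5 * np.abs(v).sum()
x = forward_backward(A, b, lam=0.5, step=0.01, iters=500)
```

The step size must stay below the reciprocal of the Lipschitz constant of the smooth gradient (here ‖A‖² ≈ 75, so 0.01 is safe) for the sufficient-decrease property the abstract refers to.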
Journal Article
Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity
by Attouch, Hedy; Peypouquet, Juan; Redont, Patrick
in Algorithms, Convergence, Differential equations
2018
In a Hilbert space setting H, we study the fast convergence properties as t → +∞ of the trajectories of the second-order differential equation ẍ(t) + (α/t)ẋ(t) + ∇Φ(x(t)) = g(t), where ∇Φ is the gradient of a convex continuously differentiable function Φ : H → ℝ, α is a positive parameter, and g : [t₀, +∞[ → H is a small perturbation term. In this inertial system, the viscous damping coefficient α/t vanishes asymptotically, but not too rapidly. For α ≥ 3 and ∫_{t₀}^{+∞} t‖g(t)‖ dt < +∞, just assuming that argmin Φ ≠ ∅, we show that any trajectory of the above system satisfies the fast convergence property Φ(x(t)) − min_H Φ ≤ C/t². Moreover, for α > 3, any trajectory converges weakly to a minimizer of Φ. Strong convergence is established in various practical situations. These results complement the O(1/t²) rate of convergence for the values obtained by Su, Boyd and Candès in the unperturbed case g = 0. Time discretization of this system, and some of its variants, provides new fast converging algorithms, expanding the field of rapid methods for structured convex minimization introduced by Nesterov and further developed by Beck and Teboulle with FISTA. This study also complements recent advances due to Chambolle and Dossal.
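Time discretization of such inertial systems (taking the unperturbed case g = 0) yields Nesterov-style accelerated schemes whose momentum coefficient grows like (k − 1)/(k + α − 1). A minimal sketch on a smooth convex quadratic; the test function, step size, and iteration count are illustrative assumptions:

```python
import numpy as np

def inertial_gradient(grad, x0, step, alpha, iters):
    """Discretization sketch of x'' + (alpha/t) x' + grad Phi(x) = 0:
    extrapolate with momentum weight (k-1)/(k+alpha-1), then take a
    gradient step at the extrapolated point."""
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(1, iters + 1):
        y = x + ((k - 1) / (k + alpha - 1)) * (x - x_prev)
        x_prev, x = x, y - step * grad(y)
    return x

# Phi(x) = 0.5 * x^T H x with H positive definite (minimum value 0 at x = 0)
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
H = M.T @ M + np.eye(5)
grad = lambda x: H @ x
x = inertial_gradient(grad, x0=np.ones(5), step=1.0 / np.linalg.norm(H, 2),
                      alpha=3.0, iters=300)
phi = 0.5 * x @ H @ x
```

With α = 3 this reduces to the classical (k − 1)/(k + 2) momentum, for which the O(1/k²) value bound applies.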
Journal Article
Automated tight Lyapunov analysis for first-order methods
2025
We present a methodology for establishing the existence of quadratic Lyapunov inequalities for a wide range of first-order methods used to solve convex optimization problems. In particular, we consider (i) classes of optimization problems of finite-sum form with (possibly strongly) convex and possibly smooth functional components, (ii) first-order methods that can be written as a linear system on state-space form in feedback interconnection with the subdifferentials of the functional components of the objective function, and (iii) quadratic Lyapunov inequalities that can be used to draw convergence conclusions. We present a necessary and sufficient condition for the existence of a quadratic Lyapunov inequality within a predefined class of Lyapunov inequalities, which amounts to solving a small-sized semidefinite program. We showcase our methodology on several first-order methods that fit the framework. Most notably, our methodology allows us to significantly extend the region of parameter choices that allow for duality gap convergence in the Chambolle-Pock method when the linear operator is the identity mapping.
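In finite dimensions, the flavor of such certificates can be illustrated without an SDP solver: for gradient descent on a quadratic, a quadratic Lyapunov inequality V(x⁺) ≤ ρ V(x) with V(x) = ‖x‖² follows from a spectral-norm contraction bound. This toy stand-in (all quantities below are illustrative assumptions, not the paper's methodology) shows the kind of conclusion a quadratic Lyapunov inequality delivers:

```python
import numpy as np

# Gradient descent x+ = x - t * H @ x on Phi(x) = 0.5 * x^T H x.
# V(x) = ||x||^2 satisfies V(x+) <= rho * V(x) with
# rho = ||I - t H||_2^2, so rho < 1 certifies linear convergence.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
H = M.T @ M + np.eye(4)                    # eigenvalues in [1, L]
L = np.linalg.norm(H, 2)                   # largest eigenvalue
t = 1.0 / L                                # step size
rho = np.linalg.norm(np.eye(4) - t * H, 2) ** 2   # contraction factor
x = rng.standard_normal(4)
x_next = x - t * (H @ x)
lhs = x_next @ x_next                      # V(x+)
rhs = rho * (x @ x)                        # rho * V(x)
```

The paper's contribution is finding such certificates systematically, for whole problem classes and general first-order methods, via a small semidefinite program rather than a hand-picked norm.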
Proximal alternating linearized minimization for nonconvex and nonsmooth problems
2014
We introduce a proximal alternating linearized minimization (PALM) algorithm for solving a broad class of nonconvex and nonsmooth minimization problems. Building on the powerful Kurdyka–Łojasiewicz property, we derive a self-contained convergence analysis framework and establish that each bounded sequence generated by PALM globally converges to a critical point. Our approach allows us to analyze various classes of nonconvex-nonsmooth problems and related nonconvex proximal forward–backward algorithms with semi-algebraic problem data, the latter property being shared by many functions arising in a wide variety of fundamental applications. A by-product of our framework also shows that our results are new even in the convex setting. As an illustration of the results, we derive a new and simple globally convergent algorithm for solving the sparse nonnegative matrix factorization problem.
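The closing illustration, nonnegative matrix factorization, gives a compact two-block instance of the PALM pattern: for each block, a gradient step on the smooth coupling term with a step size set by that block's Lipschitz constant, followed by the proximal map of the nonnegativity constraint (a projection). A toy sketch under assumed dimensions and iteration counts, without the sparsity term of the paper's sparse-NMF example:

```python
import numpy as np

def palm_nmf(M, r, iters, seed=0):
    """Two-block PALM sketch for min ||M - X @ Y||_F^2 with X, Y >= 0:
    a linearized gradient step per block, then projection onto the
    nonnegative orthant (the prox of the indicator of X, Y >= 0)."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X = rng.random((m, r))
    Y = rng.random((r, n))
    for _ in range(iters):
        Lx = np.linalg.norm(Y @ Y.T, 2) + 1e-8     # Lipschitz const. in X
        X = np.maximum(X - ((X @ Y - M) @ Y.T) / Lx, 0.0)
        Ly = np.linalg.norm(X.T @ X, 2) + 1e-8     # Lipschitz const. in Y
        Y = np.maximum(Y - (X.T @ (X @ Y - M)) / Ly, 0.0)
    return X, Y

# Exactly factorable nonnegative data of rank 2
rng = np.random.default_rng(1)
M = rng.random((8, 2)) @ rng.random((2, 6))
X, Y = palm_nmf(M, r=2, iters=2000)
rel_err = np.linalg.norm(M - X @ Y) / np.linalg.norm(M)
```

Recomputing each block's Lipschitz constant inside the loop is exactly what makes the linearized steps valid majorization steps in the PALM analysis.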
Journal Article
A Primal–Dual Splitting Method for Convex Optimization Involving Lipschitzian, Proximable and Linear Composite Terms
by Condat, Laurent
in Algorithms, Applications of Mathematics, Calculus of Variations and Optimal Control; Optimization
2013
We propose a new first-order splitting algorithm for solving jointly the primal and dual formulations of large-scale convex minimization problems involving the sum of a smooth function with Lipschitzian gradient, a nonsmooth proximable function, and linear composite functions. This is a full splitting approach, in the sense that the gradient and the linear operators involved are applied explicitly without any inversion, while the nonsmooth functions are processed individually via their proximity operators. This work brings together and notably extends several classical splitting schemes, like the forward–backward and Douglas–Rachford methods, as well as the recent primal–dual method of Chambolle and Pock designed for problems with linear composite terms.
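A common instance of this three-term structure is 1-D total-variation denoising, min 0.5‖x − b‖² + λ‖Dx‖₁ with D the finite-difference operator: the smooth term enters through its gradient, and the ℓ1 composite term through the prox of its conjugate (a box projection) acting on a dual variable. A small sketch in the primal-dual style of such splittings; the signal, λ, and step sizes are illustrative assumptions chosen to satisfy the usual step-size condition:

```python
import numpy as np

def condat_tv(b, lam, tau, sigma, iters):
    """Primal-dual splitting sketch for
    min_x 0.5*||x - b||^2 + lam*||D x||_1, D = forward differences:
    explicit gradient step on the smooth term, dual step with the
    prox of the l1 conjugate (projection onto [-lam, lam])."""
    n = b.size
    D = np.eye(n - 1, n, 1) - np.eye(n - 1, n)   # forward differences
    x = b.copy()
    u = np.zeros(n - 1)                          # dual variable
    for _ in range(iters):
        x_new = x - tau * ((x - b) + D.T @ u)    # primal step, no inversion
        u = np.clip(u + sigma * (D @ (2 * x_new - x)), -lam, lam)
        x = x_new
    return x

# Noisy piecewise-constant signal
rng = np.random.default_rng(0)
b = np.concatenate([np.zeros(20), np.ones(20)]) + 0.1 * rng.standard_normal(40)
x = condat_tv(b, lam=0.5, tau=0.5, sigma=0.25, iters=2000)
D = np.eye(39, 40, 1) - np.eye(39, 40)
obj = lambda v: 0.5 * np.linalg.norm(v - b) ** 2 + 0.5 * np.abs(D @ v).sum()
```

Note the "full splitting" property the abstract emphasizes: D and its transpose are only ever applied, never inverted.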
Journal Article
Phase recovery, MaxCut and complex semidefinite programming
by Waldspurger, Irène; d’Aspremont, Alexandre; Mallat, Stéphane
in Algorithms, Amplitudes, Analysis
2015
Phase retrieval seeks to recover a signal
x
∈
C
p
from the amplitude
|
A
x
|
of linear measurements
A
x
∈
C
n
. We cast the phase retrieval problem as a non-convex quadratic program over a complex phase vector and formulate a tractable relaxation (called
PhaseCut
) similar to the classical
MaxCut
semidefinite program. We solve this problem using a provably convergent block coordinate descent algorithm whose structure is similar to that of the original greedy algorithm in Gerchberg and Saxton (Optik 35:237–246,
1972
), where each iteration is a matrix vector product. Numerical results show the performance of this approach over three different phase retrieval problems, in comparison with greedy phase retrieval algorithms and matrix completion formulations.
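The Gerchberg–Saxton-style greedy baseline referenced here alternates between imposing the measured amplitudes and mapping back toward the range of A, with each iteration dominated by matrix–vector products. A toy sketch with simulated complex Gaussian measurements; the dimensions and iteration count are arbitrary illustrative choices:

```python
import numpy as np

def gerchberg_saxton(A, m, x0, iters):
    """Alternating sketch: keep the current phase of A @ x, impose the
    measured magnitudes m, and map back with the pseudoinverse of A."""
    Ap = np.linalg.pinv(A)
    x = x0.astype(complex).copy()
    for _ in range(iters):
        Ax = A @ x
        phase = Ax / np.maximum(np.abs(Ax), 1e-12)   # current phases
        x = Ap @ (m * phase)                         # enforce |A x| = m
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 5)) + 1j * rng.standard_normal((40, 5))
x_true = rng.standard_normal(5) + 1j * rng.standard_normal(5)
m = np.abs(A @ x_true)                               # measured amplitudes
x0 = rng.standard_normal(5) + 1j * rng.standard_normal(5)
x = gerchberg_saxton(A, m, x0, iters=500)
res0 = np.linalg.norm(np.abs(A @ x0) - m)
res = np.linalg.norm(np.abs(A @ x) - m)
```

Recovery is at best up to a global phase, which is exactly the ambiguity PhaseCut relaxes over with a semidefinite lifting.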
Journal Article
A Newton-CG algorithm with complexity guarantees for smooth unconstrained optimization
2020
We consider minimization of a smooth nonconvex objective function using an iterative algorithm based on Newton’s method and the linear conjugate gradient algorithm, with explicit detection and use of negative curvature directions for the Hessian of the objective function. The algorithm tracks Newton-conjugate gradient procedures developed in the 1980s closely, but includes enhancements that allow worst-case complexity results to be proved for convergence to points that satisfy approximate first-order and second-order optimality conditions. The complexity results match the best known results in the literature for second-order methods.
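The inner loop of such a method — conjugate gradient on the Newton system, with an explicit test for negative curvature — can be sketched as follows. This is a simplified illustration, not the paper's exact algorithm; the fallback choices on detecting negative curvature are assumptions:

```python
import numpy as np

def newton_cg_direction(g, hess_vec, cg_iters=50, eps=1e-12, tol=1e-10):
    """CG on H d = -g using only Hessian-vector products, returning
    early if a direction of (near-)negative curvature is detected."""
    d = np.zeros_like(g)
    r = -g                        # residual of H d = -g at d = 0
    p = r.copy()
    for _ in range(cg_iters):
        Hp = hess_vec(p)
        curv = p @ Hp
        if curv <= eps * (p @ p):
            # negative curvature: return the progress so far, or
            # steepest descent if the first direction is already bad
            return d if d.any() else -g
        alpha = (r @ r) / curv
        d = d + alpha * p
        r_new = r - alpha * Hp
        if np.linalg.norm(r_new) < tol:
            break
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d

# On a convex quadratic 0.5*x^T H x - b^T x, one Newton-CG step from
# the origin solves the problem (CG is exact in at most dim steps).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
H = M.T @ M + np.eye(5)
b = rng.standard_normal(5)
x = np.zeros(5)
x = x + newton_cg_direction(H @ x - b, lambda v: H @ v)
```

Only Hessian-vector products are required, which is what keeps the per-iteration cost comparable to gradient methods.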
Journal Article
The Pontryagin Maximum Principle in the Wasserstein Space
by Rossi, Francesco; Bonnet, Benoît
in Analysis, Calculus of Variations and Optimal Control; Optimization, Control
2019
We prove a Pontryagin Maximum Principle for optimal control problems in the space of probability measures, where the dynamics is given by a transport equation with non-local velocity. We formulate this first-order optimality condition using the formalism of subdifferential calculus in Wasserstein spaces. We show that the geometric approach based on needle variations and on the evolution of the covector (here replaced by the evolution of a measure on the dual space) can be translated into this formalism.
Journal Article
Convergence rate of inertial Forward–Backward algorithm beyond Nesterov’s rule
by Aujol, Jean-François; Apidopoulos, Vassilis; Dossal, Charles
in Algorithms, Convergence, Nonlinear programming
2020
In this paper we study the convergence of an inertial Forward–Backward algorithm with a particular choice of over-relaxation term. In particular, we show that for a sequence of over-relaxation parameters that do not satisfy Nesterov's rule, one can still expect relatively fast convergence properties for the objective function. In addition, we complement this work by studying the convergence of the algorithm when the proximal operator is computed inexactly, and we give sufficient conditions on these errors that guarantee convergence properties for the objective function.
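Concretely, an inertial Forward–Backward step extrapolates with an over-relaxation (momentum) coefficient before the usual forward–backward update. A sketch on ℓ1-regularized least squares with a generic coefficient k/(k + b); the choice b = 4 and the problem data are illustrative assumptions, not the paper's parameter regime:

```python
import numpy as np

def inertial_fb(A, b_vec, lam, step, inertia_b, iters):
    """Inertial Forward-Backward sketch for 0.5*||Ax-b||^2 + lam*||x||_1:
    extrapolate with over-relaxation k/(k + inertia_b), then take one
    forward-backward (proximal gradient) step at the extrapolated point."""
    soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
    x_prev = np.zeros(A.shape[1])
    x = x_prev.copy()
    for k in range(1, iters + 1):
        y = x + (k / (k + inertia_b)) * (x - x_prev)   # inertial step
        grad = A.T @ (A @ y - b_vec)
        x_prev, x = x, soft(y - step * grad, step * lam)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
b_vec = rng.standard_normal(30)
obj = lambda v: 0.5 * np.linalg.norm(A @ v - b_vec) ** 2 + 0.5 * np.abs(v).sum()
x = inertial_fb(A, b_vec, lam=0.5, step=0.01, inertia_b=4.0, iters=500)
```

Varying `inertia_b` changes how aggressively the momentum approaches 1, which is the knob the paper's analysis studies outside Nesterov's rule.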
Journal Article