Catalogue Search | MBRL
Explore the vast range of titles available.
25 result(s) for "Yurtsever, Alp"
CCCP is Frank-Wolfe in disguise
2022
This paper uncovers a simple but rather surprising connection: it shows that the well-known convex-concave procedure (CCCP) and its generalization to constrained problems are both special cases of the Frank-Wolfe (FW) method. This connection not only provides insight of deep (in our opinion) pedagogical value, but also transfers the recently discovered convergence theory of nonconvex Frank-Wolfe methods immediately to CCCP, closing a long-standing gap in its non-asymptotic convergence theory. We hope the viewpoint uncovered by this paper spurs the transfer of other advances made for FW to both CCCP and its generalizations.
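As a rough illustration of the two methods the abstract connects, here is a minimal numerical sketch of a generic Frank-Wolfe step next to a CCCP step on a toy difference-of-convex problem; the data, constraint set, and step size below are our own choices, not the paper's construction.

```python
import numpy as np

# Toy DC objective f(x) = g(x) - h(x), with
#   g(x) = 0.5 * x^T Q x + b^T x   (convex, Q is PSD)
#   h(x) = 0.5 * alpha * ||x||^2    (convex)
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)        # PSD and well conditioned
b = rng.standard_normal(n)
alpha = 0.5                        # Q - alpha*I stays PSD, so f is bounded below

def cccp(x, iters=50):
    """CCCP for f = g - h: linearize h at x_k, minimize the convex part exactly."""
    for _ in range(iters):
        # x_{k+1} = argmin_x g(x) - <grad h(x_k), x>  (set the gradient to zero)
        x = np.linalg.solve(Q, alpha * x - b)
    return x

def frank_wolfe(x, iters=50, radius=10.0):
    """Generic FW on the l1 ball: LMO returns a vertex, then a convex combination step."""
    for k in range(iters):
        g = Q @ x + b - alpha * x                   # grad f(x_k)
        s = np.zeros_like(x)
        i = np.argmax(np.abs(g))
        s[i] = -radius * np.sign(g[i])              # LMO: argmin_s <g, s> over the ball
        x = x + 2.0 / (k + 2.0) * (s - x)           # step toward the vertex
    return x

print(cccp(np.ones(n)), frank_wolfe(np.ones(n)))
```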
Convex Formulations for Training Two-Layer ReLU Neural Networks
by Prakhya, Karthik; Yurtsever, Alp; Birdal, Tolga
in Convexity; Machine learning; Neural networks
2024
Solving non-convex, NP-hard optimization problems is crucial for training machine learning models, including neural networks. However, non-convexity often leads to black-box machine learning models with unclear inner workings. While convex formulations have been used for verifying neural network robustness, their application to training neural networks remains less explored. In response to this challenge, we reformulate the problem of training infinite-width two-layer ReLU networks as a convex completely positive program in a finite-dimensional (lifted) space. Despite the convexity, solving this problem remains NP-hard due to the complete positivity constraint. To overcome this challenge, we introduce a semidefinite relaxation that can be solved in polynomial time. We then experimentally evaluate the tightness of this relaxation, demonstrating its competitive performance in test accuracy across a range of classification tasks.
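Because completely positive programs are NP-hard, the abstract's tractable route is a semidefinite relaxation. A common way to see this idea in code is the doubly nonnegative relaxation sketched below in cvxpy; it is a generic toy instance with a placeholder objective, not the paper's actual lifted ReLU training program.

```python
import cvxpy as cp
import numpy as np

# Replace the intractable "X is completely positive" constraint with the
# standard surrogate "X is PSD and entrywise nonnegative" (DNN relaxation).
n = 4
rng = np.random.default_rng(1)
C = rng.standard_normal((n, n))
C = 0.5 * (C + C.T)                                  # symmetric toy cost matrix

X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0,            # positive semidefinite
               X >= 0,            # entrywise nonnegative
               cp.trace(X) == 1]  # normalization keeps the toy problem bounded
problem = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
problem.solve(solver=cp.SCS)
print("DNN relaxation value:", problem.value)
```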
A Variational Perspective on High-Resolution ODEs
by Maskan, Hoomaan; Zygalakis, Konstantinos C; Yurtsever, Alp
in Algorithms; Euler-Lagrange equation; High resolution
2023
We consider unconstrained minimization of smooth convex functions. We propose a novel variational perspective using the forced Euler-Lagrange equation that allows for studying high-resolution ODEs. Through this, we obtain a faster convergence rate for gradient norm minimization using Nesterov's accelerated gradient method. Additionally, we show that Nesterov's method can be interpreted as a rate-matching discretization of an appropriately chosen high-resolution ODE. Finally, using the results from the new variational perspective, we propose a stochastic method for noisy gradients. Several numerical experiments compare and illustrate our stochastic algorithm against state-of-the-art methods.
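For reference, this is the standard form of Nesterov's accelerated gradient method that the abstract interprets as a rate-matching discretization; a minimal sketch on a toy least squares problem, with our own step-size choice.

```python
import numpy as np

def nesterov_agd(grad_f, x0, step, iters=200):
    """Nesterov's accelerated gradient for smooth convex f (step = 1/L)."""
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(1, iters + 1):
        y = x + (k - 1.0) / (k + 2.0) * (x - x_prev)   # momentum extrapolation
        x_prev = x
        x = y - step * grad_f(y)                       # gradient step at y
    return x

# Toy least squares: f(x) = 0.5 * ||A x - b||^2
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
L = np.linalg.norm(A, 2) ** 2                          # Lipschitz constant of grad f
x_hat = nesterov_agd(lambda x: A.T @ (A @ x - b), np.zeros(5), 1.0 / L)
print("gradient norm at solution:", np.linalg.norm(A.T @ (A @ x_hat - b)))
```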
Block Coordinate DC Programming
2024
We introduce an extension of the Difference of Convex Algorithm (DCA) in the form of a block coordinate approach for problems with separable structure. For \(n\) coordinate-blocks and \(k\) iterations, our main result proves a non-asymptotic convergence rate of \(O(n/k)\) for the proposed method. Furthermore, leveraging the connection between DCA and Expectation Maximization (EM), we propose a block coordinate EM algorithm.
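To make the update concrete, here is a minimal sketch of one possible block coordinate DCA loop under toy assumptions (quadratic blocks, a quadratic concave part, cyclic block order); the paper's scheme is more general.

```python
import numpy as np

# f(x) = g(x) - h(x) with block-separable g: g_i(x_i) = 0.5 * ||x_i||^2,
# and h(x) = 0.5 * alpha * ||x||^2 over the concatenated variable.
rng = np.random.default_rng(0)
n_blocks, d, alpha = 4, 3, 0.3
x = [rng.standard_normal(d) for _ in range(n_blocks)]

for k in range(100):
    i = k % n_blocks                  # cyclic block selection
    y_i = alpha * x[i]                # y_i = grad_{x_i} h(x): linearize h at x_k
    # block subproblem: argmin_{x_i} 0.5*||x_i||^2 - <y_i, x_i>  =>  x_i = y_i
    x[i] = y_i
```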
Federated Frank-Wolfe Algorithm
by Prakhya, Karthik; Yurtsever, Alp; Dadras, Ali
in Algorithms; Cognitive tasks; Federated learning
2024
Federated learning (FL) has gained a lot of attention in recent years for building privacy-preserving collaborative learning systems. However, FL algorithms for constrained machine learning problems are still limited, particularly when the projection step is costly. To this end, we propose a Federated Frank-Wolfe Algorithm (FedFW). FedFW features data privacy, low per-iteration cost, and communication of sparse signals. In the deterministic setting, FedFW achieves an \(\varepsilon\)-suboptimal solution within \(O(\varepsilon^{-2})\) iterations for smooth and convex objectives, and \(O(\varepsilon^{-3})\) iterations for smooth but non-convex objectives. Furthermore, we present a stochastic variant of FedFW and show that it finds a solution within \(O(\varepsilon^{-3})\) iterations in the convex setting. We demonstrate the empirical performance of FedFW on several machine learning tasks.
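The sketch below is a schematic of one federated Frank-Wolfe round as we read the abstract: clients contribute gradient information, the server calls a linear minimization oracle (LMO) over the constraint set and broadcasts a step. It is an illustrative simplification with toy clients, not the exact FedFW protocol.

```python
import numpy as np

def lmo_l1_ball(g, radius=1.0):
    """LMO for the l1 ball: argmin_{||s||_1 <= radius} <g, s> is a signed vertex."""
    s = np.zeros_like(g)
    i = np.argmax(np.abs(g))
    s[i] = -radius * np.sign(g[i])
    return s

def federated_fw_round(x, client_grads, k, radius=1.0):
    g = np.mean([grad(x) for grad in client_grads], axis=0)  # aggregate gradients
    s = lmo_l1_ball(g, radius)                               # sparse update signal
    return x + 2.0 / (k + 2.0) * (s - x)                     # classic FW step size

# Two toy quadratic clients: f_i(x) = 0.5 * ||x - c_i||^2
c1, c2 = np.array([1.0, 0.0]), np.array([0.0, -1.0])
grads = [lambda x: x - c1, lambda x: x - c2]
x = np.zeros(2)
for k in range(50):
    x = federated_fw_round(x, grads, k)
```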
Personalized Multi-tier Federated Learning
by Yurtsever, Alp; Bhuyan, Monowar; Banerjee, Sourasekhar
in Cognitive tasks; Convergence; Customization
2024
The key challenge of personalized federated learning (PerFL) is to capture the statistical heterogeneity of data with inexpensive communication and to gain customized performance for participating devices. To address these challenges, we introduce personalized federated learning in a multi-tier architecture (PerMFL) to obtain optimized and personalized local models when there are known team structures across devices. We provide theoretical guarantees for PerMFL, which offers linear convergence rates for smooth strongly convex problems and sub-linear convergence rates for smooth non-convex problems. We conduct numerical experiments demonstrating the robust empirical performance of PerMFL, which outperforms the state of the art in multiple personalized federated learning tasks.
Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization
by Yurtsever, Alp; Golyanik, Vladislav; Birdal, Tolga
in Algorithms; Computer vision; Constraints
2022
We present a hybrid classical-quantum framework based on the Frank-Wolfe algorithm, Q-FW, for solving quadratic, linearly-constrained, binary optimization problems on quantum annealers (QA). The computational premise of quantum computers has cultivated the re-design of various existing vision problems into quantum-friendly forms. Experimental QA realizations can solve a particular non-convex problem known as quadratic unconstrained binary optimization (QUBO). Yet a naive QUBO formulation cannot take into account the restrictions on the parameters. To introduce additional structure in the parameter space, researchers have crafted ad-hoc solutions incorporating (linear) constraints in the form of regularizers. However, this comes at the expense of a hyper-parameter balancing the impact of regularization. To date, a true constrained solver for quadratic binary optimization (QBO) problems has been lacking. Q-FW first reformulates constrained QBO as a copositive program (CP), then employs Frank-Wolfe iterations to solve the CP while satisfying the linear (in)equality constraints. This procedure unrolls the original constrained QBO into a sequence of unconstrained QUBOs, each of which is solved on a QA. We use the D-Wave Advantage QA to conduct synthetic and real experiments on two important computer vision problems, graph matching and permutation synchronization, which demonstrate that our approach is effective in alleviating the need for an explicit regularization coefficient.
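The sketch below captures the structural idea we take from the abstract: Frank-Wolfe over a toy stand-in for the copositive feasible set, where every linear minimization step is a QUBO, \(\min_{x \in \{0,1\}^n} x^T G x\). A brute-force loop stands in for the quantum annealer, and the projection objective is our own illustrative choice, not the paper's setup.

```python
import itertools
import numpy as np

def qubo_lmo(G):
    """Brute-force QUBO oracle (annealer stand-in): argmin over binary vectors."""
    n = G.shape[0]
    best_x, best_val = None, np.inf
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits, dtype=float)
        val = x @ G @ x
        if val < best_val:
            best_x, best_val = x, val
    return best_x

rng = np.random.default_rng(0)
n = 4
M = rng.random((n, n))
M = 0.5 * (M + M.T)                 # toy target matrix
X = np.zeros((n, n))                # iterate in conv{x x^T : x binary}
for k in range(50):
    G = X - M                       # gradient of 0.5 * ||X - M||_F^2 at X
    x = qubo_lmo(G)                 # LMO: argmin <G, x x^T> is a QUBO
    X = X + 2.0 / (k + 2.0) * (np.outer(x, x) - X)   # rank-one FW update
```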
Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates
2022
Three Operator Splitting (TOS) (Davis & Yin, 2017) can minimize the sum of multiple convex functions effectively when an efficient gradient oracle or proximal operator is available for each term. This requirement often fails in machine learning applications: (i) instead of full gradients, only stochastic gradients may be available; and (ii) instead of proximal operators, using subgradients to handle complex penalty functions may be more efficient and realistic. Motivated by these concerns, we analyze three potentially valuable extensions of TOS. The first two permit using subgradients and stochastic gradients, and are shown to ensure a \(\mathcal{O}(1/\sqrt{t})\) convergence rate. The third extension, AdapTOS, endows TOS with adaptive step-sizes. For the important setting of optimizing a convex loss over the intersection of convex sets, AdapTOS attains universal convergence rates, i.e., the rate adapts to the unknown smoothness degree of the objective. We compare our proposed methods with competing methods on various applications.
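For context, here is the base Davis-Yin iteration that the abstract's extensions build on, in a minimal sketch on a toy instance of our choosing (least squares plus an l1 penalty plus a box constraint); the paper's variants swap these oracles for subgradients, stochastic gradients, or adaptive step sizes.

```python
import numpy as np

# Davis-Yin three operator splitting for min_x h(x) + f(x) + g(x),
# with h smooth and f, g prox-friendly.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
lam = 0.1
gamma = 1.0 / np.linalg.norm(A, 2) ** 2              # step size 1/L

grad_h = lambda x: A.T @ (A @ x - b)                              # h: least squares
prox_f = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)  # f: l1
prox_g = lambda v, t: np.clip(v, -1.0, 1.0)                       # g: box indicator

z = np.zeros(5)
for _ in range(500):
    x_g = prox_g(z, gamma)
    x_f = prox_f(2.0 * x_g - z - gamma * grad_h(x_g), gamma)
    z = z + x_f - x_g                                # at a fixed point, x_f = x_g
```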
Scalable Semidefinite Programming
by Tropp, Joel A; Fercoq, Olivier; Udell, Madeleine
in Algorithms; Computational geometry; Convexity
2021
Semidefinite programming (SDP) is a powerful framework from convex optimization that has striking potential for data science applications. This paper develops a provably correct randomized algorithm for solving large, weakly constrained SDP problems by economizing on the storage and arithmetic costs. Numerical evidence shows that the method is effective for a range of applications, including relaxations of MaxCut, abstract phase retrieval, and quadratic assignment. Running on hardware equivalent to a laptop, the algorithm can handle SDP instances where the matrix variable has over \(10^{14}\) entries.
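The storage idea behind such solvers, in a simplified form reflecting our reading of the abstract: keep only a randomized sketch Y = X @ Omega of the PSD matrix variable, never X itself, and recover a low-rank approximation at the end with a shifted Nystrom reconstruction. Sizes and data below are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 500, 10
U = rng.standard_normal((n, r))
X = U @ U.T                              # stand-in for the never-stored iterate

Omega = rng.standard_normal((n, 2 * r + 1))   # random test matrix
Y = X @ Omega                                 # all the solver would store: n x O(r)

nu = 1e-8 * np.linalg.norm(Y)                 # small shift for numerical stability
Y_nu = Y + nu * Omega
B = Omega.T @ Y_nu                            # PSD up to round-off
L = np.linalg.cholesky(0.5 * (B + B.T))
F = Y_nu @ np.linalg.inv(L.T)                 # X is approximately F @ F.T - nu * I
err = np.linalg.norm(F @ F.T - nu * np.eye(n) - X) / np.linalg.norm(X)
print("relative reconstruction error:", err)
```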