1,490 result(s) for "second-order methods"
Optimization Methods for Large-Scale Machine Learning
This paper provides a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications. Through case studies on text classification and the training of deep neural networks, we discuss how optimization problems arise in machine learning and what makes them challenging. A major theme of our study is that large-scale machine learning represents a distinctive setting in which the stochastic gradient (SG) method has traditionally played a central role while conventional gradient-based nonlinear optimization techniques typically falter. Based on this viewpoint, we present a comprehensive theory of a straightforward, yet versatile SG algorithm, discuss its practical behavior, and highlight opportunities for designing algorithms with improved performance. This leads to a discussion about the next generation of optimization methods for large-scale machine learning, including an investigation of two main streams of research on techniques that diminish noise in the stochastic directions and methods that make use of second-order derivative approximations.
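As an illustration of the SG method that this survey centres on, here is a minimal sketch for a least-squares finite-sum problem; the data, fixed step size, and iteration budget are hypothetical and are not taken from the paper's case studies.

```python
import numpy as np

def sgd_least_squares(A, b, steps=5000, lr=0.05, seed=0):
    """Minimal stochastic gradient (SG) sketch for min_w (1/2n)*||A w - b||^2.

    One randomly sampled row (a_i, b_i) is used per iteration, with a fixed
    step size, so the iterates settle into a noise-dominated neighborhood of
    the solution rather than converging exactly."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)                  # pick one data point at random
        g = (A[i] @ w - b[i]) * A[i]         # gradient of 0.5*(a_i·w - b_i)^2
        w -= lr * g
    return w

# usage on small synthetic data (hypothetical; not the paper's experiments)
rng = np.random.default_rng(1)
A = rng.normal(size=(500, 5))
w_true = rng.normal(size=5)
b = A @ w_true + 0.01 * rng.normal(size=500)
print(np.linalg.norm(sgd_least_squares(A, b) - w_true))   # small residual error
```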
Hybrid Second Order Method for Orthogonal Projection onto Parametric Curve in n-Dimensional Euclidean Space
For orthogonal projection of a point onto a parametric curve, three classic first-order algorithms were presented by Hartmann (1999), Hoschek et al. (1993), and Hu et al. (2000) (hereafter, the H-H-H method). In this research, we prove the approach's first-order convergence and its independence from the initial value. For some special cases in which the H-H-H method diverges, we combine it with Newton's second-order method (hereafter, Newton's method) to create a hybrid second-order method for orthogonal projection onto a parametric curve in n-dimensional Euclidean space (hereafter, our method). Because it is a hybrid iteration, our method converges faster than current methods, achieves second-order convergence, and remains independent of the initial value. We provide numerical examples to confirm the robustness and high efficiency of the method.
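As a rough illustration of the Newton (second-order) ingredient, the footpoint parameter can be driven by Newton's method on g(t) = (c(t) - p)·c'(t) = 0; the planar curve, point, and starting value below are hypothetical, and this sketch omits the H-H-H stabilization that the hybrid method relies on.

```python
import numpy as np

def newton_footpoint(c, dc, ddc, p, t0, iters=20, tol=1e-12):
    """Newton iteration for the footpoint condition g(t) = (c(t)-p)·c'(t) = 0.

    c, dc, ddc: callables returning the curve and its first/second derivatives
    (each an array in R^n). Converges quadratically near the solution but,
    unlike the hybrid method in the paper, needs a reasonable starting t0."""
    t = float(t0)
    for _ in range(iters):
        r = c(t) - p
        g = r @ dc(t)                        # derivative of 0.5*||c(t)-p||^2
        dg = dc(t) @ dc(t) + r @ ddc(t)      # its derivative
        if abs(dg) < 1e-15:
            break
        step = g / dg
        t -= step
        if abs(step) < tol:
            break
    return t

# usage: project p onto the planar parabola c(t) = (t, t^2)
c   = lambda t: np.array([t, t * t])
dc  = lambda t: np.array([1.0, 2.0 * t])
ddc = lambda t: np.array([0.0, 2.0])
p = np.array([0.5, 2.0])
t_star = newton_footpoint(c, dc, ddc, p, t0=1.0)
print(t_star, c(t_star))
```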
Superfast Second-Order Methods for Unconstrained Convex Optimization
In this paper, we present new second-order methods with convergence rate O(k^{-4}), where k is the iteration counter. This is faster than the existing lower bound for this type of scheme (Agarwal and Hazan in Proceedings of the 31st Conference on Learning Theory, PMLR, pp. 774–792, 2018; Arjevani and Shiff in Math Program 178(1–2):327–360, 2019), which is O(k^{-7/2}). Our progress can be explained by a finer specification of the problem class. The main idea of this approach is to implement the third-order scheme from Nesterov (Math Program 186:157–183, 2021) using the second-order oracle. At each iteration of our method, we solve a nontrivial auxiliary problem by a linearly convergent scheme based on the relative non-degeneracy condition (Bauschke et al. in Math Oper Res 42:330–348, 2016; Lu et al. in SIOPT 28(1):333–354, 2018). During this process, the Hessian of the objective function is computed once, and the gradient is computed O(ln(1/ε)) times, where ε is the desired accuracy of the solution for our problem.
Exploiting negative curvature in deterministic and stochastic optimization
This paper addresses the question of whether it can be beneficial for an optimization algorithm to follow directions of negative curvature. Although prior work has established convergence results for algorithms that integrate both descent and negative curvature steps, there has not yet been extensive numerical evidence showing that such methods offer consistent performance improvements. In this paper, we present new frameworks for combining descent and negative curvature directions: alternating two-step approaches and dynamic step approaches. The aspect that distinguishes our approaches from ones previously proposed is that they make algorithmic decisions based on (estimated) upper-bounding models of the objective function. A consequence of this aspect is that our frameworks can, in theory, employ fixed stepsizes, which makes the methods readily translatable from deterministic to stochastic settings. For deterministic problems, we show that instances of our dynamic framework yield gains in performance compared to related methods that only follow descent steps. We also show that gains can be made in a stochastic setting in cases when a standard stochastic-gradient-type method might make slow progress.
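As a toy illustration of choosing between a descent direction and a negative curvature direction, the sketch below uses an explicit eigendecomposition of the Hessian and fixed, hypothetical step sizes, unlike the upper-bounding-model frameworks described in the abstract.

```python
import numpy as np

def descent_or_negative_curvature_step(grad, hess, x, alpha=0.1, beta=0.5):
    """Take either a gradient-descent step or a negative-curvature step.

    If the Hessian has a sufficiently negative eigenvalue, move along the
    corresponding eigenvector (signed so it is a descent direction); otherwise
    take a plain gradient step. The fixed step sizes alpha, beta are
    placeholders for the upper-bounding-model choices in the paper."""
    g = grad(x)
    lam, V = np.linalg.eigh(hess(x))
    if lam[0] < -1e-8:                        # negative curvature available
        d = V[:, 0]
        d = -np.sign(g @ d) * d if abs(g @ d) > 0 else d
        return x + beta * abs(lam[0]) * d
    return x - alpha * g

# usage on a saddle: f(x) = x0^2 - x1^2 (unbounded below; this only
# illustrates how the negative-curvature step escapes the saddle)
grad = lambda x: np.array([2 * x[0], -2 * x[1]])
hess = lambda x: np.diag([2.0, -2.0])
x = np.array([1.0, 1e-3])
for _ in range(5):
    x = descent_or_negative_curvature_step(grad, hess, x)
print(x)   # x1 has grown: the iterates left the saddle region
```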
Second Order Fully Semi-Lagrangian Discretizations of Advection-Diffusion-Reaction Systems
We propose a second order, fully semi-Lagrangian method for the numerical solution of systems of advection-diffusion-reaction equations, which is based on a semi-Lagrangian approach to approximate in time both the advective and the diffusive terms. The proposed method allows the use of large time steps, while avoiding the solution of large linear systems, which would be required by implicit time discretization techniques. Standard interpolation procedures are used for the space discretization on structured and unstructured meshes. A novel extrapolation technique is proposed to enforce second-order accurate Dirichlet boundary conditions. We include a theoretical analysis of the scheme, along with numerical experiments which demonstrate the effectiveness of the proposed approach and its superior efficiency with respect to more conventional explicit and implicit time discretizations.
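The underlying semi-Lagrangian idea of tracing characteristics back and interpolating at departure points can be sketched in one dimension; the step below is only first-order, handles pure advection with a constant velocity, and uses linear interpolation on a periodic grid, all hypothetical simplifications of the second-order advection-diffusion-reaction scheme analysed in the paper.

```python
import numpy as np

def semi_lagrangian_advection_step(u, a, dt, dx):
    """One semi-Lagrangian step for u_t + a u_x = 0 on a periodic 1D grid.

    Each grid point x_i is traced back to its departure point x_i - a*dt,
    and the new value is obtained by linear interpolation of the old field.
    The step is unconditionally stable, so a*dt may exceed dx (large steps)."""
    n = len(u)
    x = np.arange(n) * dx
    x_dep = (x - a * dt) % (n * dx)           # departure points, periodic wrap
    xp = np.concatenate([x, [n * dx]])        # extended grid for interpolation
    up = np.concatenate([u, [u[0]]])
    return np.interp(x_dep, xp, up)

# usage: advect a Gaussian bump with a Courant number of 4
n, dx, a, dt = 200, 1.0 / 200, 1.0, 0.02
x = np.arange(n) * dx
u = np.exp(-200 * (x - 0.3) ** 2)
for _ in range(10):
    u = semi_lagrangian_advection_step(u, a, dt, dx)
print(float(u.max()))                         # bump transported to x ≈ 0.5
```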
A non-monotone trust-region method with noisy oracles and additional sampling
In this work, we introduce a novel stochastic second-order method, within the framework of a non-monotone trust-region approach, for solving the unconstrained, nonlinear, and non-convex optimization problems arising in the training of deep neural networks. The proposed algorithm makes use of subsampling strategies that yield noisy approximations of the finite sum objective function and its gradient. We introduce an adaptive sample size strategy based on inexpensive additional sampling to control the resulting approximation error. Depending on the estimated progress of the algorithm, this can yield sample size scenarios ranging from mini-batch to full sample functions. We provide convergence analysis for all possible scenarios and show that the proposed method achieves almost sure convergence under standard assumptions for the trust-region framework. We report numerical experiments showing that the proposed algorithm outperforms its state-of-the-art counterpart in deep neural network training for image classification and regression tasks while requiring a significantly smaller number of gradient evaluations.
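A bare-bones sketch of the subsampling-plus-additional-sampling ingredient follows; it uses plain gradient steps rather than the paper's non-monotone trust-region machinery, and every constant and acceptance rule is a hypothetical simplification.

```python
import numpy as np

def noisy_gd_with_additional_sampling(losses, grads, x0, n,
                                      iters=60, batch=8, lr=0.1, seed=0):
    """Gradient steps on subsampled finite-sum estimates.

    An independent "additional sample" checks whether a step decreases the
    estimated objective; if not, the sample size is enlarged. This simplifies
    the paper's algorithm, which embeds the idea in a non-monotone trust
    region with second-order models."""
    rng = np.random.default_rng(seed)
    f = lambda x, S: np.mean([losses[i](x) for i in S])          # noisy objective
    g = lambda x, S: np.mean([grads[i](x) for i in S], axis=0)   # noisy gradient
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        S = rng.choice(n, size=min(batch, n), replace=False)
        trial = x - lr * g(x, S)
        S2 = rng.choice(n, size=min(batch, n), replace=False)    # additional sampling
        if f(trial, S2) <= f(x, S2):
            x = trial                       # estimated progress: accept the step
        else:
            batch = min(2 * batch, n)       # estimates disagree: grow the sample
    return x

# usage: least squares written as a finite sum of per-example losses
rng = np.random.default_rng(1)
A = rng.normal(size=(64, 3)); b = A @ np.array([1.0, -2.0, 0.5])
losses = [lambda x, a=a_i, y=b_i: 0.5 * (a @ x - y) ** 2 for a_i, b_i in zip(A, b)]
grads  = [lambda x, a=a_i, y=b_i: (a @ x - y) * a for a_i, b_i in zip(A, b)]
print(noisy_gd_with_additional_sampling(losses, grads, np.zeros(3), n=64))
```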
Finding Second-Order Stationary Points in Constrained Minimization: A Feasible Direction Approach
This paper introduces a method for computing points satisfying the second-order necessary optimality conditions for nonconvex minimization problems subject to a closed and convex constraint set. The method comprises two independent steps corresponding to the first- and second-order conditions. The first-order step is a generic closed map algorithm, which can be chosen from a variety of first-order algorithms, making it adjustable to the given problem. The second-order step can be viewed as a second-order feasible direction step for nonconvex minimization subject to a convex set. We prove that any limit point of the resulting scheme satisfies the second-order necessary optimality condition, and establish the scheme’s convergence rate and complexity, under standard and mild assumptions. Numerical tests illustrate the proposed scheme.
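A rough sketch of the two-step structure for a box-constrained problem follows: a projected-gradient step stands in for the generic first-order closed-map algorithm, and a simple negative-curvature check over the coordinates strictly inside the box stands in for the second-order feasible direction step; all tolerances and step sizes are hypothetical.

```python
import numpy as np

def two_step_box(grad, hess, x, lo, hi, iters=200, lr=0.1, tol=1e-8):
    """Alternate a projected-gradient (first-order) step with a simple
    negative-curvature (second-order) step over the box [lo, hi]^n."""
    proj = lambda z: np.clip(z, lo, hi)
    for _ in range(iters):
        # first-order step: any closed-map first-order method would do here
        x_new = proj(x - lr * grad(x))
        if np.linalg.norm(x_new - x) > tol:
            x = x_new
            continue
        # second-order step: look for negative curvature among the
        # coordinates that are strictly inside the box (free coordinates)
        free = (x > lo + 1e-10) & (x < hi - 1e-10)
        if not free.any():
            break
        lam, V = np.linalg.eigh(hess(x)[np.ix_(free, free)])
        if lam[0] >= -1e-8:
            break                              # no exploitable negative curvature
        d = np.zeros_like(x)
        d[free] = V[:, 0]
        d *= -np.sign(d @ grad(x)) if abs(d @ grad(x)) > 0 else 1.0
        x = proj(x + lr * d)                   # feasible move along the direction
    return x

# usage: f(x) = -x0^2 + x1^2 on the box [-1, 1]^2
grad = lambda x: np.array([-2 * x[0], 2 * x[1]])
hess = lambda x: np.diag([-2.0, 2.0])
print(two_step_box(grad, hess, np.array([0.0, 0.5]), -1.0, 1.0))  # ends near (1, 0)
```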
A second-order method for strongly convex ...-regularization problems
(ProQuest: ... denotes formulae and/or non-USASCII text omitted; see image.) In this paper a robust second-order method is developed for the solution of strongly convex ...-regularized problems. The main aim is to make the proposed method as inexpensive as possible, while still solving even difficult problems efficiently. The proposed approach is a primal-dual Newton conjugate gradients (pdNCG) method. Convergence properties of pdNCG are studied and worst-case iteration complexity is established. Numerical results are presented on synthetic sparse least-squares problems and real-world machine learning problems.
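The primal-dual machinery of pdNCG is beyond a short sketch, but the Newton conjugate gradients building block is available off the shelf; the smooth strongly convex quadratic below is purely illustrative and does not reproduce the regularized problem, whose formula is omitted in this record.

```python
import numpy as np
from scipy.optimize import minimize

# a strongly convex smooth test problem: 0.5*x'Qx - b'x with Q positive definite
rng = np.random.default_rng(0)
M = rng.normal(size=(20, 20))
Q = M @ M.T + 20 * np.eye(20)          # well-conditioned SPD matrix
b = rng.normal(size=20)

f     = lambda x: 0.5 * x @ Q @ x - b @ x
grad  = lambda x: Q @ x - b
hessp = lambda x, p: Q @ p             # Hessian-vector product used by CG

# Newton-CG: each (inexact) Newton step is computed by conjugate gradients
res = minimize(f, np.zeros(20), jac=grad, hessp=hessp, method="Newton-CG",
               options={"xtol": 1e-10})
print(res.nit, np.linalg.norm(grad(res.x)))
```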
A cubic regularization of Newton’s method with finite difference Hessian approximations
In this paper, we present a version of the cubic regularization of Newton's method for unconstrained nonconvex optimization, in which the Hessian matrices are approximated by forward finite difference Hessians. The regularization parameter of the cubic models and the accuracy of the Hessian approximations are jointly adjusted using a nonmonotone line search criterion. Assuming that the Hessian of the objective function is globally Lipschitz continuous, we show that the proposed method needs at most O(n ε^{-3/2}) function and gradient evaluations to generate an ε-approximate stationary point, where n is the dimension of the domain of the objective function. Preliminary numerical results corroborate our theoretical findings.
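A forward finite-difference Hessian approximation of the kind used here is straightforward to write down; the increment h and the test function below are placeholders, and the paper additionally adjusts the approximation accuracy jointly with the cubic regularization parameter.

```python
import numpy as np

def forward_fd_hessian(grad, x, h=1e-5):
    """Approximate the Hessian from n extra gradient evaluations:
    column i is (grad(x + h*e_i) - grad(x)) / h, and the result is
    symmetrized because forward differences are not exactly symmetric."""
    n = x.size
    g0 = grad(x)
    H = np.empty((n, n))
    for i in range(n):
        e = np.zeros(n); e[i] = h
        H[:, i] = (grad(x + e) - g0) / h
    return 0.5 * (H + H.T)

# usage: Rosenbrock gradient, compared against the exact Hessian
from scipy.optimize import rosen_der, rosen_hess
x = np.array([-1.2, 1.0])
print(np.max(np.abs(forward_fd_hessian(rosen_der, x) - rosen_hess(x))))
```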