35 results for "gradient-free optimization"
Wake expansion continuation: Multi‐modality reduction in the wind farm layout optimization problem
In this paper we present a continuation optimization method, which we call wake expansion continuation (WEC), for reducing multi‐modality in the wind farm layout optimization problem. We achieve the reduction in multi‐modality by starting with an increased wake diameter while maintaining normal velocity deficits at the center of the wakes, and then reducing the wake diameter over a series of optimization runs until the accurate wake diameter is used. We applied and demonstrated the effectiveness of WEC with two different wake models. We tested WEC on four optimization case studies and compared the results with a gradient‐based optimization method and a gradient‐free optimization method. We found a significant improvement in the mean, standard deviation, and minimum wake loss for optimization with WEC compared to optimization without WEC for all test cases. We found that the gradient‐free optimization algorithm produced less optimal layouts on average, for all cases, than the gradient‐based algorithm with WEC. We also applied WEC to the gradient‐free algorithm for one case study with significantly improved results, but the improvement was larger when WEC was applied to a gradient‐based algorithm. WEC enables gradient‐based algorithms to search the wind farm layout optimization space more globally, yielding better results more consistently than optimization without WEC.
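As a rough illustration of the continuation idea described in this abstract, the sketch below first optimizes a deliberately smoothed toy loss and then warm-starts progressively sharper stages. The toy loss, the spread schedule, and the L-BFGS-B inner optimizer are illustrative assumptions, not the paper's wake models or solver.

    import numpy as np
    from scipy.optimize import minimize

    def wake_loss(x, spread):
        # Toy multi-modal stand-in for wake loss: a smooth bowl plus an
        # oscillatory term whose wavelength grows with `spread`, loosely
        # mimicking WEC's enlarged wake diameter (fewer local optima).
        return (x / 10.0) ** 2 - np.cos(x / spread) * np.exp(-0.01 * x ** 2)

    x0 = 8.0
    for spread in [4.0, 2.0, 1.0]:  # relax back toward the accurate width
        res = minimize(lambda x, s=spread: wake_loss(x[0], s), [x0],
                       method="L-BFGS-B")
        x0 = res.x[0]               # warm-start the next, sharper stage
    print(f"final layout variable: {x0:.3f}, loss: {res.fun:.4f}")

Run with the schedule above, the final stage starts close to the basin of the accurate landscape's optimum instead of the local minimum nearest the initial guess.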
Survey of Optimization Algorithms in Modern Neural Networks
The main goal of machine learning is the creation of self-learning algorithms in many areas of human activity, allowing artificial intelligence to take over tasks from people in the pursuit of expanded production. The theory of artificial neural networks, which have already replaced humans in many problems, remains the most widely used branch of machine learning. Thus, one must select appropriate neural network architectures, data processing methods, and advanced applied mathematics tools. A common challenge for these networks is achieving the highest accuracy in a short time. This problem is often addressed by modifying networks and improving data pre-processing, in which case accuracy increases along with training time. By using optimization methods, one can improve the accuracy without increasing the training time. In this review, we consider the existing optimization algorithms used in neural networks. We present modifications of optimization algorithms of the first, second, and information-geometric order, the latter related to information geometry for Fisher–Rao and Bregman metrics. These optimizers have significantly influenced the development of neural networks through geometric and probabilistic tools. We present applications of all the given optimization algorithms, considering the types of neural networks. After that, we show ways to develop optimization algorithms in further research using modern neural networks; fractional-order, bilevel, and gradient-free optimizers can replace classical gradient-based optimizers. Such approaches arise in graph, spiking, complex-valued, quantum, and wavelet neural networks. Besides pattern recognition, time-series prediction, and object detection, there are many other applications in machine learning: quantum computations, partial differential and integrodifferential equations, and stochastic processes.
GradFreeBits: Gradient-Free Bit Allocation for Mixed-Precision Neural Networks
Quantized neural networks (QNNs) are among the main approaches for deploying deep neural networks on low-resource edge devices. Training QNNs with different levels of precision throughout the network (mixed-precision quantization) typically achieves superior trade-offs between performance and computational load. However, optimizing the precision levels of a QNN can be complicated, as the bit allocations are discrete and difficult to differentiate with respect to. Moreover, adequately accounting for the dependencies between the bit allocations of different layers is not straightforward. To meet these challenges, in this work we propose GradFreeBits: a novel joint optimization scheme for training mixed-precision QNNs, which alternates between gradient-based optimization for the weights and gradient-free optimization for the bit allocation. Our method achieves performance better than or on par with current state-of-the-art low-precision classification networks on CIFAR10/100 and ImageNet, semantic segmentation networks on Cityscapes, and several graph neural network benchmarks. Furthermore, our approach can be extended to a variety of other applications involving neural networks used in conjunction with parameters that are difficult to optimize for.
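A minimal sketch of the alternating scheme this abstract describes, with a toy quadratic task loss standing in for network training and simple random local search standing in for the paper's gradient-free optimizer; the cost weighting, bit bounds, and update rule are all assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    def quantize(w, bits):
        # Uniform quantization of w to the given per-weight bit widths.
        levels = 2 ** bits - 1
        scale = np.max(np.abs(w)) + 1e-12
        return np.round(w / scale * levels) / levels * scale

    def loss(w, bits):
        task = np.sum((quantize(w, bits) - 1.0) ** 2)  # toy task loss
        cost = 0.05 * np.sum(bits)                     # penalize wide bits
        return task + cost

    w = rng.normal(size=4)                # "layer" weights
    bits = np.array([8, 8, 8, 8])         # discrete bit allocation

    for step in range(200):
        # Gradient-based step on the continuous weights (gradient of the
        # task term taken w.r.t. the unquantized weights, straight-through
        # style).
        w -= 0.05 * 2.0 * (w - 1.0)
        # Gradient-free step on the bit allocation: propose a random +/-1
        # change to one entry and keep it only if the loss improves.
        cand = bits.copy()
        i = rng.integers(len(bits))
        cand[i] = np.clip(cand[i] + rng.choice([-1, 1]), 2, 8)
        if loss(w, cand) < loss(w, bits):
            bits = cand

    print("bits:", bits, "loss:", round(loss(w, bits), 4))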
An adaptive Bayesian approach to gradient-free global optimization
Many problems in science and technology require finding global minima or maxima of complicated objective functions. The importance of global optimization has inspired the development of numerous heuristic algorithms based on analogies with physical, chemical or biological systems. Here we present a novel algorithm, SmartRunner, which employs a Bayesian probabilistic model, informed by the history of accepted and rejected moves, to decide on the next random trial. Thus, SmartRunner intelligently adapts its search strategy to a given objective function and moveset, with the goal of maximizing fitness gain (or energy loss) per function evaluation. Our approach is equivalent to adding a simple adaptive penalty to the original objective function, with SmartRunner performing hill ascent on the modified landscape. The adaptive penalty can be added to many other global optimization schemes, enhancing their ability to find high-quality solutions. We have explored SmartRunner’s performance on a standard set of test functions, the Sherrington–Kirkpatrick spin glass model, and Kauffman’s NK fitness model, finding that it compares favorably with several widely-used alternative approaches to gradient-free optimization.
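The sketch below shows the adaptive-penalty mechanism in a simplified form: SmartRunner's penalty is derived from a Bayesian model of accepted and rejected moves, whereas this toy simply penalizes states by visit count so that hill ascent on the modified landscape keeps exploring. The fitness function and penalty strength are invented for illustration.

    import numpy as np
    from collections import defaultdict

    rng = np.random.default_rng(1)

    def fitness(s):
        # Toy rugged fitness on binary strings (not the SK or NK model).
        x = np.array(s)
        return int(x.sum()) - 3 * int(x[0] ^ x[-1])

    n = 20
    s = tuple(rng.integers(0, 2, n))
    visits = defaultdict(int)     # history of evaluated states
    lam = 0.1                     # penalty strength (assumed knob)

    best_f = fitness(s)
    for _ in range(2000):
        i = rng.integers(n)       # random single-bit move
        cand = s[:i] + (1 - s[i],) + s[i + 1:]
        visits[cand] += 1
        # Hill ascent on the *penalized* landscape: states tried often
        # look less attractive, pushing the search into new regions.
        if fitness(cand) - lam * visits[cand] >= fitness(s) - lam * visits[s]:
            s = cand
        best_f = max(best_f, fitness(s))

    print("best fitness found:", best_f)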
Tensor Network Modeling of Electronic Structure of Semiconductor Nanoparticles and Sensory Effect of Layers Based on Them
This paper develops a mathematical apparatus for modeling the electronic structure of semiconductor nanoparticles and describing the sensor response of layers constructed from them. The developed technique involves solutions of both the direct and the inverse problem. The direct problem consists of two coupled sets of differential equations, solved at fixed values of the physical parameters. The first is the set of chemical kinetics equations describing the processes occurring at the surface of a nanoparticle. The second contains an equation describing the electron concentration distribution inside a nanoparticle. The inverse problem consists of determining the physical parameters (essentially, reaction rate constants) that provide a good approximation to the experimental data when used in the solution of the direct problem. The mathematical novelty of this paper is the application, for the first time, of new gradient-free optimization methods based on low-rank tensor-train decomposition and the modern machine learning paradigm to the solution of the inverse problem. The sensor effect was measured in a dedicated set of experiments. Comparisons of computed and experimental data on the sensor effect were carried out and demonstrated sufficiently good agreement.
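To make the inverse-problem setup concrete, here is a small sketch that recovers the rate constants of a hypothetical two-step kinetics scheme from synthetic data with a gradient-free search. Differential evolution is a plainly swapped-in stand-in for the paper's tensor-train-based optimizers, and the reaction scheme is invented for illustration.

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import differential_evolution

    t_obs = np.linspace(0.0, 5.0, 25)

    def direct_problem(k):
        # Direct problem: surface kinetics A -> B -> C with rate
        # constants k, standing in for the paper's coupled model.
        def rhs(t, y):
            a, b = y
            return [-k[0] * a, k[0] * a - k[1] * b]
        sol = solve_ivp(rhs, (0.0, 5.0), [1.0, 0.0], t_eval=t_obs)
        return sol.y[1]           # "sensor response" ~ species B

    k_true = [1.2, 0.4]
    noise = 0.01 * np.random.default_rng(2).normal(size=t_obs.size)
    data = direct_problem(k_true) + noise

    def misfit(k):
        return np.sum((direct_problem(k) - data) ** 2)

    # Gradient-free search over the rate constants (the paper instead
    # uses optimizers built on low-rank tensor-train decomposition).
    res = differential_evolution(misfit,
                                 bounds=[(0.01, 5.0), (0.01, 5.0)],
                                 seed=2)
    print("recovered rate constants:", np.round(res.x, 3))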
Objective and algorithm considerations when optimizing the number and placement of turbines in a wind power plant
Optimizing turbine layout is a challenging problem that has been extensively researched in the literature. However, optimizing the number of turbines within a given boundary has not been studied as extensively and is a difficult problem because it introduces discrete design variables and a discontinuous design space. An essential step in performing wind power plant layout optimization is to define the objective function, or value, that is used to express what is valuable to a wind power plant developer, such as annual energy production, cost of energy, or profit. In this paper, we demonstrate the importance of selecting the appropriate objective function when optimizing a wind power plant in a land-constrained site. We optimized several different wind power plants with different wind resources and boundary sizes. Results show that the optimal number of turbines varies drastically depending on the objective function. For a simple, one-dimensional, land-based scenario, we found that a wind power plant optimized for minimal cost of energy produced just 72% of the profit of the wind power plant optimized for maximum profit, corresponding to a loss of about USD 2 million each year. This paper also compares the performance of several different optimization algorithms, including a novel repeated-sweep algorithm that we developed. We found that the performance of each algorithm depended on the number of design variables in the problem as well as the objective function.
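A toy numeric illustration of the abstract's central point, that the choice of objective changes the optimal turbine count; the energy and cost models below are invented assumptions, not the paper's.

    import numpy as np

    n = np.arange(1, 61)                  # candidate turbine counts

    # Toy plant models (illustrative assumptions only):
    aep = 30_000.0 * n ** 0.85            # MWh/yr, sublinear from wake loss
    annual_cost = 2e6 + 0.8e6 * n         # USD/yr, fixed + per-turbine cost
    price = 50.0                          # USD/MWh

    coe = annual_cost / aep               # cost of energy, USD/MWh
    profit = price * aep - annual_cost    # USD/yr

    n_coe, n_profit = n[np.argmin(coe)], n[np.argmax(profit)]
    print(f"COE-optimal turbine count:    {n_coe}")
    print(f"profit-optimal turbine count: {n_profit}")
    print(f"profit lost by optimizing COE instead: "
          f"{profit[np.argmax(profit)] - profit[np.argmin(coe)]:,.0f} USD/yr")

Even with these crude models, the two objectives select visibly different plant sizes, which is the effect the paper quantifies.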
Training multi-layer binary neural networks with random local binary error signals
Binary neural networks (BNNs) significantly reduce computational complexity and memory usage in machine and deep learning by representing weights and activations with just one bit. However, most existing training algorithms for BNNs rely on quantization-aware floating-point stochastic gradient descent (SGD), limiting the full exploitation of binary operations to the inference phase only. In this work, we propose, for the first time, a fully binary and gradient-free training algorithm for multi-layer BNNs, eliminating the need for back-propagated floating-point gradients. Specifically, the proposed algorithm relies on local binary error signals and binary weight updates, employing integer-valued hidden weights that serve as a synaptic metaplasticity mechanism, thereby enhancing its neurobiological plausibility. Our proposed solution enables the training of binary multi-layer perceptrons by using exclusively XNOR, Popcount, and increment/decrement operations. Experimental results on multi-class classification benchmarks show test accuracy improvements of up to +35.47% over the only existing fully binary single-layer state-of-the-art solution. Compared to full-precision SGD, our solution improves test accuracy by up to +35.30% under the same total memory demand, while also reducing computational cost by two to three orders of magnitude in terms of the total number of Boolean gates. The proposed algorithm is made available to the scientific community as a public repository.
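The sketch below demonstrates only the binary primitives named in this abstract: an XNOR-plus-popcount dot product and increment/decrement updates to bounded integer hidden weights. The training rule used here, a perceptron-style local update on a toy target, is an assumption for illustration and not the paper's random local binary error signals.

    import numpy as np

    rng = np.random.default_rng(3)
    n_in = 16

    def binarize(h):
        # Binary weights are the signs of the integer hidden weights.
        return np.where(h >= 0, 1, -1)

    def binary_neuron(x, w):
        # +/-1 dot product via XNOR + popcount: after packing +/-1 into
        # {0,1} bits, XNOR marks agreements, and the popcount of those
        # agreements recovers the dot product as 2*matches - n.
        xb, wb = (x + 1) // 2, (w + 1) // 2
        matches = int(np.sum(~(xb ^ wb) & 1))
        return 1 if 2 * matches - len(x) >= 0 else -1

    h = rng.integers(-4, 5, n_in)   # integer "metaplastic" hidden weights

    for _ in range(500):
        x = rng.choice([-1, 1], n_in)
        target = 1 if x[0] + x[1] + x[2] > 0 else -1  # toy majority target
        if binary_neuron(x, binarize(h)) != target:
            h = np.clip(h + target * x, -4, 4)  # increment/decrement only

    correct = 0
    for _ in range(500):
        x = rng.choice([-1, 1], n_in)
        correct += binary_neuron(x, binarize(h)) == (
            1 if x[0] + x[1] + x[2] > 0 else -1)
    print(f"toy accuracy: {correct / 500:.2f}")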
Variable metric random pursuit
We consider unconstrained randomized optimization of smooth convex objective functions in the gradient-free setting. We analyze Random Pursuit (RP) algorithms with fixed (F-RP) and variable metric (V-RP). The algorithms only use zeroth-order information about the objective function and compute an approximate solution by repeated optimization over randomly chosen one-dimensional subspaces. The distribution of search directions is dictated by the chosen metric. Variable-metric RP uses novel variants of a randomized zeroth-order Hessian approximation scheme recently introduced by Leventhal and Lewis (Optimization 60(3):329–345, 2011, doi: 10.1080/02331930903100141). Here we present (1) a refined analysis of the expected single-step progress of RP algorithms and their global convergence on (strictly) convex functions and (2) novel convergence bounds for V-RP on strongly convex functions. We also quantify how well the employed metric needs to match the local geometry of the function in order for the RP algorithms to converge at the best possible rate. Our theoretical results are accompanied by numerical experiments comparing V-RP with the derivative-free schemes CMA-ES, Implicit Filtering, Nelder–Mead, NEWUOA, Pattern Search, and Nesterov’s gradient-free algorithms.
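A minimal fixed-metric Random Pursuit sketch matching the abstract's description: zeroth-order line searches over randomly chosen one-dimensional subspaces. The test function and iteration budget are assumptions; the variable-metric variant would instead draw directions from a distribution shaped by a Hessian estimate.

    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(4)

    def f(x):
        # Smooth, ill-conditioned convex quadratic as a test objective.
        d = np.arange(1, x.size + 1)
        return float(np.sum(d * x ** 2))

    x = rng.normal(size=10)
    for _ in range(300):
        u = rng.normal(size=x.size)   # isotropic direction (fixed metric)
        u /= np.linalg.norm(u)
        # Zeroth-order 1-D optimization over the subspace {x + t*u}:
        step = minimize_scalar(lambda t: f(x + t * u)).x
        x = x + step * u

    print(f"objective after 300 random-pursuit steps: {f(x):.2e}")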
A Gradient-Free Topology Optimization Strategy for Continuum Structures with Design-Dependent Boundary Loads
In this paper, the topology optimization of continuum structures with design-dependent loads is studied with a gradient-free topology optimization method combined with an adaptive body-fitted finite element mesh. The material-field series-expansion (MFSE) model represents the structural topology using a bounded material field with specified spatial correlation and provides a crisp structural boundary description. This feature makes it convenient to identify the loading surface for the application of the design-dependent boundary loads and to generate a body-fitted mesh for structural analysis. Using a dimension reduction technique, the number of design variables is significantly decreased, which enables the use of an efficient Kriging-based algorithm to solve the topology optimization problem. The effectiveness of the proposed method is demonstrated with several numerical examples, including a design problem with geometric and contact nonlinearity.
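As a sketch of the Kriging-based step over a dimension-reduced design space, the loop below runs a generic Gaussian-process (Kriging) surrogate optimization with expected improvement over two stand-in coefficients. The objective is a placeholder for the body-fitted finite element analysis, and the whole setup is an illustrative assumption rather than the paper's algorithm.

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(5)

    def objective(c):
        # Stand-in compliance over 2 reduced material-field coefficients
        # (the real value would come from a body-fitted FE analysis).
        return (c[0] - 0.3) ** 2 + (c[1] + 0.5) ** 2 + 0.1 * np.sin(5 * c[0])

    X = rng.uniform(-1, 1, size=(8, 2))        # initial sampled designs
    y = np.array([objective(c) for c in X])

    for _ in range(20):
        gp = GaussianProcessRegressor(kernel=RBF(0.5),
                                      normalize_y=True).fit(X, y)
        # Expected improvement over a random candidate pool:
        cand = rng.uniform(-1, 1, size=(256, 2))
        mu, sd = gp.predict(cand, return_std=True)
        imp = y.min() - mu
        z = imp / np.maximum(sd, 1e-9)
        ei = imp * norm.cdf(z) + sd * norm.pdf(z)
        c_next = cand[np.argmax(ei)]           # evaluate the best candidate
        X = np.vstack([X, c_next])
        y = np.append(y, objective(c_next))

    print("best coefficients:", np.round(X[np.argmin(y)], 3),
          "objective:", round(y.min(), 4))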
Optimal control of buoyancy-driven liquid steel stirring modeled with single-phase Navier–Stokes equations
Gas stirring is an important process used in secondary metallurgy. It makes it possible to homogenize the temperature and the chemical composition of the liquid steel and to remove inclusions, which can be detrimental to end-product quality. In this process, argon gas is injected through two nozzles at the bottom of the vessel and rises by buoyancy through the liquid steel, thereby causing stirring, i.e., a mixing of the bath. The gas flow rates and the positions of the nozzles are two important control parameters in practice. A continuous optimization approach is pursued to find optimal values for these control variables. The effect of the gas appears as a volume force in the single-phase incompressible Navier–Stokes equations. Turbulence is modeled with the Smagorinsky large eddy simulation (LES) model. An objective functional based on the vorticity is used to describe the mixing in the liquid bath. Optimized configurations are compared with a default configuration whose design is based on a setup from industrial practice.
Gas stirring is an important process used in secondary metallurgy. It allows to homogenize the temperature and the chemical composition of the liquid steel and to remove inclusions which can be detrimental for the end-product quality. In this process, argon gas is injected from two nozzles at the bottom of the vessel and rises by buoyancy through the liquid steel thereby causing stirring, i.e., a mixing of the bath. The gas flow rates and the positions of the nozzles are two important control parameters in practice. A continuous optimization approach is pursued to find optimal values for these control variables. The effect of the gas appears as a volume force in the single-phase incompressible Navier–Stokes equations. Turbulence is modeled with the Smagorinsky Large Eddy Simulation (LES) model. An objective functional based on the vorticity is used to describe the mixing in the liquid bath. Optimized configurations are compared with a default one whose design is based on a setup from industrial practice.