8,343 results for "Minima"
Locally adaptive activation functions with slope recovery for deep and physics-informed neural networks
We propose two approaches to locally adaptive activation functions, namely layer-wise and neuron-wise locally adaptive activation functions, which improve the performance of deep and physics-informed neural networks. The local adaptation of the activation function is achieved by introducing a scalable parameter in each layer (layer-wise) or for every neuron (neuron-wise) separately, and then optimizing it using a variant of the stochastic gradient descent algorithm. To further increase the training speed, a slope-recovery term based on the activation slopes is added to the loss function, which accelerates convergence and thereby reduces the training cost. On the theoretical side, we prove that in the proposed method the gradient descent algorithms are not attracted to sub-optimal critical points or local minima under practical conditions on the initialization and learning rate, and that the gradient dynamics of the proposed method are not achievable by base methods with any (adaptive) learning rates. We further show that the adaptive activation methods accelerate convergence by implicitly multiplying conditioning matrices into the gradient of the base method, without any explicit computation of the conditioning matrix or the matrix-vector product. The different adaptive activation functions are shown to induce different implicit conditioning matrices. Furthermore, the proposed methods with slope recovery are shown to accelerate the training process.
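The mechanism is easy to see in code. Below is a minimal, hypothetical PyTorch sketch of a layer-wise adaptive tanh activation with a slope-recovery penalty; the scale factor `n`, the initialization `a = 1/n`, and the exact penalty form are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of a layer-wise adaptive activation (not the authors' code).
# The trainable slope `a`, the fixed scale factor `n`, and the slope-recovery
# penalty form used here are illustrative assumptions.
import torch
import torch.nn as nn

class AdaptiveTanhLayer(nn.Module):
    """Linear layer followed by tanh(n * a * z) with a trainable slope a."""
    def __init__(self, in_dim, out_dim, n=10.0):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.a = nn.Parameter(torch.tensor(1.0 / n))  # trainable slope
        self.n = n                                    # fixed scale factor

    def forward(self, x):
        return torch.tanh(self.n * self.a * self.linear(x))

def slope_recovery(layers):
    # Penalty that shrinks as the slopes grow: reciprocal of mean exp(a_k).
    a = torch.stack([layer.a for layer in layers])
    return 1.0 / torch.exp(a).mean()

# Usage: total_loss = data_loss + slope_recovery([layer1, layer2, ...])
```

Because the penalty decreases as the slopes grow, minimizing the total loss pushes the activation slopes up, which is the acceleration effect the abstract describes.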
Covering Dimension of C*-Algebras and 2-Coloured Classification
The authors introduce the concept of finitely coloured equivalence for unital *-homomorphisms between C*-algebras, for which unitary equivalence is the 1-coloured case. They use this notion to classify *-homomorphisms from separable, unital, nuclear C*-algebras into ultrapowers of simple, unital, nuclear, Z-stable C*-algebras with compact extremal trace space up to 2-coloured equivalence by their behaviour on traces; this is based on a 1-coloured classification theorem for certain order zero maps, also in terms of tracial data. As an application the authors calculate the nuclear dimension of non-AF, simple, separable, unital, nuclear, Z-stable C*-algebras with compact extremal trace space: it is 1. In the case that the extremal trace space also has finite topological covering dimension, this confirms the remaining open implication of the Toms-Winter conjecture. Inspired by homotopy-rigidity theorems in geometry and topology, the authors derive a "homotopy equivalence implies isomorphism" result for large classes of C*-algebras with finite nuclear dimension.
A mean field view of the landscape of two-layer neural networks
Multilayer neural networks are among the most powerful models in machine learning, yet the fundamental reasons for this success defy mathematical understanding. Learning a neural network requires optimizing a nonconvex high-dimensional objective (risk function), a problem that is usually attacked using stochastic gradient descent (SGD). Does SGD converge to a global optimum of the risk or only to a local optimum? In the former case, does this happen because local minima are absent or because SGD somehow avoids them? In the latter, why do local minima reached by SGD have good generalization properties? In this paper, we consider a simple case, namely two-layer neural networks, and prove that, in a suitable scaling limit, SGD dynamics is captured by a certain nonlinear partial differential equation (PDE) that we call distributional dynamics (DD). We then consider several specific examples and show how DD can be used to prove convergence of SGD to networks with nearly ideal generalization error. This description allows for “averaging out” some of the complexities of the landscape of neural networks and can be used to prove a general convergence result for noisy SGD.
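For orientation, the distributional dynamics can be written schematically as follows; this is a paraphrase under the paper's scaling assumptions, with the notation assumed here (ρ_t is the limiting distribution of neuron parameters, ξ(t) a step-size schedule).

```latex
% Schematic form of the distributional dynamics (DD); notation assumed.
\partial_t \rho_t
  = 2\,\xi(t)\, \nabla_\theta \cdot \bigl( \rho_t \, \nabla_\theta \Psi(\theta; \rho_t) \bigr),
\qquad
\Psi(\theta; \rho) = V(\theta) + \int U(\theta, \tilde{\theta})\, \rho(\mathrm{d}\tilde{\theta}),
```

where, roughly, V(θ) captures the correlation of one neuron with the labels and U(θ, θ̃) the interaction between pairs of neurons. The PDE replaces the coupled evolution of finitely many neurons by a gradient flow on the space of distributions, which is what makes the “averaging out” in the abstract possible.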
Complex Energy Landscapes in Spiked-Tensor and Simple Glassy Models: Ruggedness, Arrangements of Local Minima, and Phase Transitions
We study rough high-dimensional landscapes in which an increasingly stronger preference for a given configuration emerges. Such energy landscapes arise in glass physics and inference. In particular, we focus on random Gaussian functions and on the spiked-tensor model and generalizations. We thoroughly analyze the statistical properties of the corresponding landscapes and characterize the associated geometrical phase transitions. In order to perform our study, we develop a framework based on the Kac-Rice method that allows us to compute the complexity of the landscape, i.e., the logarithm of the typical number of stationary points and their Hessian. This approach generalizes the one used to compute rigorously the annealed complexity of mean-field glass models. We discuss its advantages with respect to previous frameworks, in particular, the thermodynamical replica method, which is shown to lead to partially incorrect predictions.
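The Kac-Rice method mentioned here counts stationary points of a random landscape H on R^N; a generic, simplified form (regularity assumptions omitted) is:

```latex
% Generic Kac-Rice counting formula (simplified; assumptions omitted).
\mathbb{E}\bigl[\#\{\theta : \nabla H(\theta) = 0\}\bigr]
  = \int \mathbb{E}\Bigl[\, \bigl|\det \nabla^2 H(\theta)\bigr|
        \;\Big|\; \nabla H(\theta) = 0 \Bigr]\,
    p_{\nabla H(\theta)}(0)\, \mathrm{d}\theta,
```

and the annealed complexity is then Σ = lim_{N→∞} N^{-1} log E[#], the quantity whose sign changes mark the geometrical phase transitions the abstract refers to.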
A new fusion of whale optimizer algorithm with Kapur’s entropy for multi-threshold image segmentation: analysis and validations
The separation of an object from other objects or from the background by selecting optimal threshold values remains a challenge in the field of image segmentation. Threshold segmentation is one of the most popular image segmentation techniques. Traditional methods for finding the optimum threshold are computationally expensive, tedious, and may be inaccurate. Hence, this paper proposes an Improved Whale Optimization Algorithm (IWOA) based on Kapur’s entropy for solving multi-threshold segmentation of gray-level images. IWOA supports its performance with a linear convergence-increasing and local-minima-avoidance technique (LCMA) and a ranking-based updating method (RUM). LCMA accelerates the convergence of the solutions toward the optimal solution and tries to avoid the local-minima problem that can arise during optimization. To do so, it randomly updates the positions of the worst solutions to lie near the best solution and, with a certain probability, randomly within the search space, to avoid getting stuck in local minima. Because of the randomization LCMA uses when updating solutions toward the best ones, a large number of solutions around the best are skipped; RUM therefore replaces unbeneficial solutions with a novel updating scheme to address this problem. We compare IWOA with seven other algorithms on a set of well-known test images, using several performance measures such as fitness values, Peak Signal to Noise Ratio, Structured Similarity Index Metric, Standard Deviation, and CPU time.
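To make the fitness function concrete, here is a minimal sketch of Kapur's entropy for a set of thresholds, the objective such an optimizer maximizes; this is an illustrative reimplementation, not the paper's code.

```python
# Kapur's entropy as a multi-threshold fitness function (illustrative sketch).
import numpy as np

def kapur_entropy(hist, thresholds):
    """Sum of Shannon entropies of the classes induced by the thresholds.

    hist: normalized gray-level histogram (sums to 1).
    thresholds: threshold bins splitting [0, len(hist)) into classes.
    """
    bounds = [0] + sorted(thresholds) + [len(hist)]
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        p = hist[lo:hi]
        w = p.sum()                        # class probability mass
        if w <= 0:
            continue
        q = p[p > 0] / w                   # within-class distribution
        total += -(q * np.log(q)).sum()    # class entropy
    return total

# Usage with a dummy 256-bin histogram and two thresholds:
hist = np.random.rand(256); hist /= hist.sum()
print(kapur_entropy(hist, [85, 170]))
```

The optimizer's job is then to search the space of threshold vectors for the one maximizing this entropy, which is where the whale-style position updates and the LCMA/RUM modifications come in.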
A Novel Regional‐Minima Image Segmentation Method for Fluid Transport Simulations in Unresolved Rock Images
Unresolved digital rock images are often used to avoid the high computational costs and limited fields of view associated with processing fine-resolution rock images. However, segmentation of unresolved images using classical methods is suboptimal due to the presence of the partial-volume effect, and suboptimal segmentations can significantly influence the geometry and effective properties of the reconstructed models. This study reveals that partial-volume pixels with high pore fractions remain as regional minima in intensity levels in unresolved images. By identifying these regional-minima pixels, we can effectively extract pore space obscured by the partial-volume effect. Based on this observation, we propose a novel segmentation method capable of identifying these regional-minima partial-volume pixels and converting them to pure pore pixels, thereby binarizing the digital rock images. The method is validated on sandstone and carbonate rock samples, and demonstrates a notable improvement in modeled permeability accuracy: more than 50% over the thresholding method and more than 30% over the watershed method. Moreover, models segmented by this approach exhibit smaller pore and throat sizes than the substantially overestimated results obtained by classical methods. These findings suggest that the regional-minima segmentation method effectively corrects for the partial-volume effect and preserves more detailed pore structures. Consequently, it enhances the quality of binarized rock geometries, leading to improved accuracy in fluid-flow simulations.
Key Points:
  • We proposed a novel regional-minima segmentation method designed for reconstructing pore spaces from unresolved images
  • The method provides improved pore- and throat-size distributions and flow paths
  • The method enhances the accuracy of fluid-flow simulations
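A simplified sketch of the idea follows: pixels that are local intensity minima but sit above the pore threshold are treated as partial-volume pixels and converted to pore. This is an assumed toy version (3x3 neighborhood, fixed threshold, local rather than true morphological regional minima), not the authors' implementation.

```python
# Toy version of the regional-minima conversion (not the authors' code).
import numpy as np
from scipy import ndimage as ndi

def regional_minima_segmentation(img, pore_threshold):
    """Binarize a grayscale rock image; True marks pore pixels."""
    pore = img <= pore_threshold                    # clearly resolved pore
    # Local minima: pixels equal to the minimum of their 3x3 neighborhood.
    minima = img == ndi.minimum_filter(img, size=3)
    # Partial-volume candidates: minima that thresholding would call solid.
    partial_volume = minima & ~pore
    return pore | partial_volume

# Usage on a dummy image:
img = np.random.rand(64, 64)
binary = regional_minima_segmentation(img, pore_threshold=0.3)
```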
Improved Artificial Potential Field Method Applied for AUV Path Planning
With growing interest in intelligent AUVs, control and navigation have become key research fields. This paper presents a concise and reliable path-planning method for AUVs based on an improved artificial potential field (APF) method, with which the AUV can make obstacle-avoidance decisions based on its own state and the motion of obstacles. The APF method has been widely applied in static real-time path planning. In this study, we present an improved APF method that addresses some of its inherent shortcomings, such as local minima and the inaccessibility of the target. A distance correction factor is added to the repulsive potential field function to solve the GNRON (goals non-reachable with obstacles nearby) problem, and a regular hexagon-guided method is proposed to mitigate the local-minima problem. Meanwhile, a relative-velocity method for detecting and avoiding moving objects is proposed for dynamic environments. This method considers not only the spatial location but also the magnitude and direction of the velocity of the moving objects, so dynamic obstacles can be avoided in time; the proposed path-planning method is therefore suitable for both static and dynamic environments. A virtual environment was built and simulations were carried out in MATLAB. Simulation results show that the proposed method has promising feasibility and efficiency for real-time AUV path planning. We also demonstrate the performance of the proposed method in a real environment: experimental results show that it avoids obstacles efficiently and finds an optimized path.
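For intuition, here is a textbook APF sketch with the distance-correction idea: the repulsive term is scaled by the distance to the goal so it vanishes there, which is what resolves GNRON. The gains, influence radius, and exponent are assumed, and only the dominant repulsive term is kept (the full gradient of the corrected potential has a second term); this is not the paper's implementation.

```python
# Simplified APF force with a distance-correction factor (illustrative only).
import numpy as np

def apf_force(q, goal, obstacle, k_att=1.0, k_rep=100.0, rho0=5.0, n=2):
    """Net force at position q (2D numpy arrays) from one obstacle."""
    f_att = -k_att * (q - goal)              # gradient of 0.5*k*|q - goal|^2
    rho = np.linalg.norm(q - obstacle)       # distance to the obstacle
    d_goal = np.linalg.norm(q - goal)
    if rho >= rho0 or rho == 0.0:
        return f_att                         # outside the obstacle's influence
    # Classic repulsion scaled by d_goal**n, so repulsion -> 0 at the goal.
    f_rep = (k_rep * (1.0/rho - 1.0/rho0) / rho**2
             * (q - obstacle) / rho * d_goal**n)
    return f_att + f_rep

# Usage: step the vehicle along the normalized force direction.
q = np.array([0.0, 0.0]); goal = np.array([10.0, 10.0]); obs = np.array([5.0, 5.2])
for _ in range(5):
    f = apf_force(q, goal, obs)
    q = q + 0.1 * f / (np.linalg.norm(f) + 1e-9)
```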
The sparse(st) optimization problem: reformulations, optimality, stationarity, and numerical results
We consider the sparse optimization problem with nonlinear constraints and an objective function given by the sum of a general smooth mapping and an additional term defined by the ℓ0-quasi-norm. This term promotes sparse solutions but is difficult to handle due to its nonconvexity and nonsmoothness (the sparsity-improving term is even discontinuous). The aim of this paper is to present two reformulations of this program as a smooth nonlinear program with complementarity-type constraints. We show that these programs are equivalent in terms of local and global minima and introduce a problem-tailored stationarity concept, which turns out to coincide with the standard KKT conditions of the two reformulated problems. In addition, a suitable constraint qualification as well as second-order conditions for the sparse optimization problem are investigated. These are then used to show that three Lagrange–Newton-type methods are locally fast convergent. Numerical results on different classes of test problems indicate that these methods can be used to drastically improve sparse solutions obtained by some other (globally convergent) methods for sparse optimization problems.
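To make "complementarity-type constraints" concrete, one well-known way to rewrite the ℓ0 term introduces auxiliary variables y; the version below is a generic, assumed form, not necessarily the paper's exact reformulation.

```latex
% Original sparse problem (generic form):
\min_{x}\; f(x) + \lambda \|x\|_0
\quad \text{s.t.}\quad g(x) \le 0,\; h(x) = 0.
% A complementarity-type reformulation with y \in \mathbb{R}^n, e = (1,\dots,1)^\top:
\min_{x,\,y}\; f(x) + \lambda\, e^\top (e - y)
\quad \text{s.t.}\quad g(x) \le 0,\; h(x) = 0,\;
x_i\, y_i = 0,\; 0 \le y_i \le 1 \;\; (i = 1,\dots,n).
```

The constraint x_i y_i = 0 forces y_i = 0 whenever x_i ≠ 0, and at a minimizer y_i = 1 whenever x_i = 0, so e^T(e - y) counts the nonzero entries; the discontinuous quasi-norm is thereby traded for smooth data plus complementarity constraints.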
Gradient descent optimizes over-parameterized deep ReLU networks
We study the problem of training deep fully connected neural networks with the Rectified Linear Unit (ReLU) activation function and the cross-entropy loss function for binary classification using gradient descent. We show that with proper random weight initialization, gradient descent can find the global minima of the training loss for an over-parameterized deep ReLU network, under certain assumptions on the training data. The key idea of our proof is that Gaussian random initialization followed by gradient descent produces a sequence of iterates that stay inside a small perturbation region centered at the initial weights, in which the training loss function of the deep ReLU networks enjoys nice local curvature properties that ensure the global convergence of gradient descent. At the core of our proof technique are (1) a milder assumption on the training data; (2) a sharp analysis of the trajectory length for gradient descent; and (3) a finer characterization of the size of the perturbation region. Compared with the concurrent work (Allen-Zhu et al. in A convergence theory for deep learning via over-parameterization, 2018a; Du et al. in Gradient descent finds global minima of deep neural networks, 2018a) along this line, our result relies on a milder over-parameterization condition on the neural network width, and enjoys a faster global convergence rate of gradient descent for training deep neural networks.
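The proof pattern described in this abstract can be stated schematically as a local gradient-dominance argument; the constants and conditions below are assumed simplifications, not the paper's precise statement.

```latex
% Schematic local-curvature argument (assumed constants \mu, \beta, \eta).
% Suppose that inside the perturbation region \mathcal{B}(W_0, R) the loss is
% \beta-smooth and gradient-dominated: \|\nabla L(W)\|_F^2 \ge 2\mu\, L(W).
% Then gradient descent W_{t+1} = W_t - \eta \nabla L(W_t) with \eta \le 1/\beta gives
L(W_{t+1}) \le L(W_t) - \tfrac{\eta}{2}\,\|\nabla L(W_t)\|_F^2 \le (1 - \eta\mu)\, L(W_t),
% so the loss decays geometrically as long as the iterates stay in the region.
```

The technical work in such proofs is then to show that over-parameterization makes the region radius R large enough, and the trajectory short enough, that the iterates never leave it.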