Catalogue Search | MBRL
Explore the vast range of titles available.
788 result(s) for "hyperparameter optimization"
A survey on multi-objective hyperparameter optimization algorithms for machine learning
2023
Hyperparameter optimization (HPO) is a necessary step to ensure the best possible performance of Machine Learning (ML) algorithms. Several methods have been developed to perform HPO; most of these are focused on optimizing one performance measure (usually an error-based measure), and the literature on such single-objective HPO problems is vast. Recently, though, algorithms have appeared that focus on optimizing multiple conflicting objectives simultaneously. This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms, distinguishing between metaheuristic-based algorithms, metamodel-based algorithms and approaches using a mixture of both. We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
Journal Article
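A minimal, self-contained sketch (not from the survey) of the core multi-objective HPO idea the article covers: sample hyperparameter configurations at random, score each on two conflicting objectives, and keep the Pareto-optimal set. The objectives and hyperparameters below are hypothetical toys.

```python
import random

def pareto_front(points):
    """Return the points not weakly dominated on (error, cost); lower is better in both."""
    front = []
    for p in points:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return front

def evaluate(lr, width):
    # Toy conflicting objectives: error shrinks with model width, cost grows with it.
    error = abs(lr - 0.1) + 1.0 / width
    cost = width * 0.01
    return (error, cost)

random.seed(0)
# Random search over a hypothetical (learning rate, layer width) space.
samples = [(10 ** random.uniform(-3, 0), random.randint(8, 256)) for _ in range(50)]
scores = [evaluate(lr, w) for lr, w in samples]
front = pareto_front(scores)
```

A multi-objective HPO run returns this whole front rather than a single "best" configuration; the user then picks a trade-off point.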
Hyperparameter Bayesian Optimization of Gaussian Process Regression Applied in Speed-Sensorless Predictive Torque Control of an Autonomous Wind Energy Conversion System
by
Yanis Hamoudi
,
Maher G. M. Abdolrasol
,
Hocine Amimeur
in
Algorithms
,
Alternative energy sources
,
Buildings and facilities
2023
This paper introduces a novel approach to speed-sensorless predictive torque control (PTC) in an autonomous wind energy conversion system, specifically utilizing an asymmetric double star induction generator (ADSIG). To achieve accurate estimation of non-linear quantities, the Gaussian Process Regression (GPR) algorithm is employed as a powerful machine learning tool for designing speed and flux estimators. To enhance the capabilities of the GPR, two improvements were implemented: (a) hyperparameter optimization through the Bayesian optimization (BO) algorithm and (b) curation of the input vector using the gray box concept, leveraging our existing knowledge of the ADSIG. Simulation results demonstrate that the proposed GPR-PTC remains robust and unaffected by the absence of a speed sensor, maintaining performance even under varying magnetizing inductance. This enables a reliable and cost-effective control solution.
Journal Article
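Not the paper's estimator, but a minimal NumPy sketch of what a GPR posterior-mean prediction looks like: here `length` (RBF length scale) and `sigma2` (noise/jitter) stand in for exactly the kind of GPR hyperparameters the paper tunes with Bayesian optimization. The data are a toy example.

```python
import numpy as np

def gpr_mean(X, y, Xs, length=0.3, sigma2=1e-8):
    """Posterior mean of Gaussian process regression with an RBF kernel."""
    def k(A, B):
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d / length ** 2)
    K = k(X, X) + sigma2 * np.eye(len(X))   # jitter for numerical stability
    return k(Xs, X) @ np.linalg.solve(K, y)

# Toy 1-D training data; with small noise the GP interpolates it.
X = np.array([[0.0], [0.5], [1.0]])
y = np.array([0.0, 1.0, 0.0])
pred = gpr_mean(X, y, X)
```

Poorly chosen `length` or `sigma2` makes the fit either over-smooth or numerically unstable, which is why an outer BO loop over these values pays off.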
The role of hyperparameters in machine learning models and how to tune them
by
Biedebach, Luka
,
Küpfer, Andreas
,
Arnold, Christian
in
Ability
,
Computer science
,
Documentation
2024
Hyperparameters critically influence how well machine learning models perform on unseen, out-of-sample data. Systematically comparing the performance of different hyperparameter settings will often go a long way in building confidence about a model's performance. However, analyzing 64 machine learning related manuscripts published in three leading political science journals (APSR, PA, and PSRM) between 2016 and 2021, we find that only 13 publications (20.31 percent) report the hyperparameters and also how they tuned them in either the paper or the appendix. We illustrate the dangers of cursory attention to model and tuning transparency in comparing machine learning models’ capability to predict electoral violence from tweets. The tuning of hyperparameters and their documentation should become a standard component of robustness checks for machine learning models.
Journal Article
Modified particle swarm optimization (MPSO) optimized CNN’s hyperparameters for classification
2025
This paper proposes a convolutional neural network architectural design approach using the modified particle swarm optimization (MPSO) algorithm. Adjusting hyperparameters and searching for an optimal architecture for convolutional neural networks (CNNs) is an interesting challenge. Network performance and learning efficiency on a given problem depend on the hyperparameter values, which span large and complex search spaces. Heuristic-based search is well suited to this type of problem; the main contribution of this research is applying the MPSO algorithm to find optimal CNN parameters, including the number of convolution layers, the convolution filter size, the number of convolution filters, and the batch size. The parameters obtained using MPSO are kept the same in each convolution layer, and the objective function evaluated by MPSO is the classification rate. The optimized architecture is implemented on the Batik motif database. The research found that the proposed model produced the best results, with a classification rate higher than 94%, comparing favorably with other state-of-the-art approaches. This research demonstrates the performance of the MPSO algorithm in optimizing CNN architectures, highlighting its potential for improving image recognition tasks.
Journal Article
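A hedged sketch of plain PSO (not the paper's modified variant) minimizing a toy stand-in for validation error over two continuous hyperparameters. In the paper the objective would be a full CNN training-and-evaluation run and the positions would encode layer counts and filter sizes; here both are simplified to keep the example self-contained.

```python
import random

def pso(objective, bounds, n_particles=20, iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over box-bounded continuous dimensions with basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp to the search bounds after the velocity update.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]), bounds[d][1])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy stand-in for validation error over (learning rate, dropout), optimum at (0.1, 0.5).
err = lambda p: (p[0] - 0.1) ** 2 + (p[1] - 0.5) ** 2
best, best_val = pso(err, [(0.0, 1.0), (0.0, 1.0)])
```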
Plant Disease Detection Using Deep Convolutional Neural Network
by
Pandian, J.
,
Geman, Oana
,
Kanchanadevi, K.
in
basic image manipulation
,
deep convolutional neural networks
,
generative adversarial network
2022
In this research, we proposed a novel 14-layered deep convolutional neural network (14-DCNN) to detect plant leaf diseases using leaf images. A new dataset was created using various open datasets. Data augmentation techniques were used to balance the individual class sizes of the dataset. Three image augmentation techniques were used: basic image manipulation (BIM), deep convolutional generative adversarial network (DCGAN) and neural style transfer (NST). The dataset consists of 147,500 images of 58 different healthy and diseased plant leaf classes and one no-leaf class. The proposed DCNN model was trained in the multi-graphics processing units (MGPUs) environment for 1000 epochs. The random search with the coarse-to-fine searching technique was used to select the most suitable hyperparameter values to improve the training performance of the proposed DCNN model. On the 8850 test images, the proposed DCNN model achieved 99.9655% overall classification accuracy, 99.7999% weighted average precision, 99.7966% weighted average recall, and 99.7968% weighted average F1 score. Additionally, the overall performance of the proposed DCNN model was better than the existing transfer learning approaches.
Journal Article
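The abstract mentions random search with a coarse-to-fine refinement. A minimal 1-D sketch of that pattern (the objective below is a hypothetical validation-error curve over the log10 learning-rate exponent, not the paper's DCNN): sample coarsely over the full range, then re-sample inside a narrow window around the best point found.

```python
import random

def coarse_to_fine(objective, low, high, n_coarse=30, n_fine=30, shrink=0.2, seed=1):
    """Random-search a 1-D hyperparameter, then re-search a window around the best."""
    rng = random.Random(seed)
    coarse = [rng.uniform(low, high) for _ in range(n_coarse)]
    best = min(coarse, key=objective)
    half = (high - low) * shrink / 2
    lo, hi = max(low, best - half), min(high, best + half)
    # Include the coarse winner so the fine stage can never do worse.
    fine = [rng.uniform(lo, hi) for _ in range(n_fine)] + [best]
    return min(fine, key=objective)

# Toy error curve over the log10 learning-rate exponent, minimum at -2 (lr = 1e-2).
err = lambda e: (e + 2.0) ** 2
best_e = coarse_to_fine(err, -4.0, 0.0)
```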
Scalable Gaussian process-based transfer surrogates for hyperparameter optimization
by
Schilling, Nicolas
,
Wistuba, Martin
,
Schmidt-Thieme, Lars
in
Artificial intelligence
,
Covariance matrix
,
Datasets
2018
Algorithm selection as well as hyperparameter optimization are tedious tasks that have to be dealt with when applying machine learning to real-world problems. Sequential model-based optimization (SMBO), based on so-called “surrogate models”, has been employed to allow for faster and more direct hyperparameter optimization. A surrogate model is a machine learning regression model which is trained on the meta-level instances in order to predict the performance of an algorithm on a specific data set given the hyperparameter settings and data set descriptors. Gaussian processes, for example, make good surrogate models as they provide probability distributions over labels. Recent work on SMBO also includes meta-data, i.e. observed hyperparameter performances on other data sets, into the process of hyperparameter optimization. This can, for example, be accomplished by learning transfer surrogate models on all available instances of meta-knowledge; however, the increasing amount of meta-information can make Gaussian processes infeasible, as they require the inversion of a large covariance matrix which grows with the number of instances. Consequently, instead of learning a joint surrogate model on all of the meta-data, we propose to learn individual surrogate models on the observations of each data set and then combine all surrogates into a joint one using ensembling techniques. The final surrogate is a weighted sum of all data set specific surrogates plus an additional surrogate that is solely learned on the target observations. Within our framework, any surrogate model can be used, and we explore Gaussian processes in this scenario. We present two different strategies for finding the weights used in the ensemble: the first is based on a probabilistic product of experts approach, and the second is based on kernel regression. Additionally, we extend the framework to directly estimate the acquisition function in the same setting, using a novel technique which we name the “transfer acquisition function”. In an empirical evaluation including comparisons to the current state-of-the-art on two publicly available meta-data sets, we are able to demonstrate that our proposed approach not only scales to large meta-data, but also finds stronger prediction models.
Journal Article
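A minimal sketch of the weighted-sum structure the abstract describes: per-dataset surrogates plus a target-only surrogate, combined with fixed weights. This is only the combination step; the authors' product-of-experts and kernel-regression weighting schemes, and the Gaussian-process surrogates themselves, are replaced here by hypothetical toy functions.

```python
def ensemble_predict(x, surrogates, weights, target_surrogate, target_weight):
    """Weighted sum of per-dataset surrogates plus a target-only surrogate."""
    total = target_weight * target_surrogate(x)
    norm = target_weight
    for s, w in zip(surrogates, weights):
        total += w * s(x)
        norm += w
    return total / norm

# Hypothetical per-dataset surrogates: each maps a hyperparameter to predicted error.
s1 = lambda x: (x - 0.2) ** 2      # optimum observed near 0.2 on dataset 1
s2 = lambda x: (x - 0.4) ** 2      # optimum observed near 0.4 on dataset 2
target = lambda x: (x - 0.3) ** 2  # few target observations, optimum near 0.3

pred = ensemble_predict(0.3, [s1, s2], [0.5, 0.5], target, 1.0)
```

Because each surrogate is trained on one dataset only, no single large covariance matrix needs inverting, which is the scalability point of the paper.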
Feature-space selection with banded ridge regression
by
Gallant, Jack L.
,
Eickenberg, Michael
,
Nunez-Elizalde, Anwar O.
in
Complementarity
,
Computational neuroscience
,
Decomposition
2022
• Using multiple feature spaces in a joint encoding model improves prediction accuracy.
• The variance explained by the joint model can be decomposed over feature spaces.
• Banded ridge regression optimizes the regularization for each feature space.
• Banded ridge regression contains an implicit feature-space selection mechanism.
• Banded ridge regression can be solved with random search or gradient descent.
Encoding models provide a powerful framework to identify the information represented in brain recordings. In this framework, a stimulus representation is expressed within a feature space and is used in a regularized linear regression to predict brain activity. To account for a potential complementarity of different feature spaces, a joint model is fit on multiple feature spaces simultaneously. To adapt regularization strength to each feature space, ridge regression is extended to banded ridge regression, which optimizes a different regularization hyperparameter per feature space. The present paper proposes a method to decompose over feature spaces the variance explained by a banded ridge regression model. It also describes how banded ridge regression performs a feature-space selection, effectively ignoring non-predictive and redundant feature spaces. This feature-space selection leads to better prediction accuracy and to better interpretability. Banded ridge regression is then mathematically linked to a number of other regression methods with similar feature-space selection mechanisms. Finally, several methods are proposed to address the computational challenge of fitting banded ridge regressions on large numbers of voxels and feature spaces. All implementations are released in an open-source Python package called Himalaya.
Journal Article
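A minimal NumPy sketch of the banded ridge idea (not the Himalaya implementation): one regularization strength per feature space, solved in closed form. A large lambda on a non-predictive band shrinks its weights toward zero, which is the implicit feature-space selection the abstract describes. The data are synthetic.

```python
import numpy as np

def banded_ridge(X_bands, y, lambdas):
    """Ridge regression with one regularization strength per feature space (band)."""
    X = np.hstack(X_bands)
    # Diagonal penalty: every column in a band shares that band's lambda.
    diag = np.concatenate([np.full(Xb.shape[1], lam)
                           for Xb, lam in zip(X_bands, lambdas)])
    return np.linalg.solve(X.T @ X + np.diag(diag), X.T @ y)

rng = np.random.default_rng(0)
X1 = rng.standard_normal((100, 5))   # predictive feature space
X2 = rng.standard_normal((100, 5))   # pure-noise feature space
true_w = rng.standard_normal(5)
y = X1 @ true_w + 0.1 * rng.standard_normal(100)

# A huge lambda on the noise band effectively selects it out of the model.
w = banded_ridge([X1, X2], y, lambdas=[1.0, 1e6])
```

In practice the per-band lambdas are themselves hyperparameters, tuned by random search or gradient descent as the highlights note.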
Hyperparameter importance and optimization of quantum neural networks across small datasets
by
van Rijn, Jan N.
,
Dunjko, Vedran
,
Moussa, Charles
in
Artificial Intelligence
,
Computer Science
,
Control
2024
As restricted quantum computers become available, research focuses on finding meaningful applications. For example, in quantum machine learning, a special type of quantum circuit called a quantum neural network is one of the most investigated approaches. However, we know little about suitable circuit architectures or important model hyperparameters for a given task. In this work, we apply the functional ANOVA framework to the quantum neural network architectures to analyze which of the quantum machine learning hyperparameters are most influential for their predictive performance. We restrict our study to 7 open-source datasets from the OpenML-CC18 classification benchmark, which are small enough for simulations on quantum hardware with fewer than 20 qubits. Using this framework, three main levels of importance were identified, confirming expected patterns and revealing new insights. For instance, the learning rate is identified as the most important hyperparameter on all datasets, whereas the particular choice of entangling gates used is found to be the least important on all except for one dataset. In addition to identifying the relevant hyperparameters, for each of them, we also learned data-driven priors based on values that perform well on previously seen datasets, which can then be used to steer hyperparameter optimization processes. We utilize these priors in the hyperparameter optimization method hyperband and show that these improve performance against uniform sampling across all datasets by, on average, 0.53%, up to 6.11%, in cross-validation accuracy. We also demonstrate that such improvements hold on average regardless of the configuration hyperband is run with. Our work introduces new methodologies for studying quantum machine learning models toward quantum model selection in practice. All research code is made publicly available.
Journal Article
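Hyperband's inner loop is successive halving; the paper's contribution is feeding it learned priors instead of uniform sampling. A hedged sketch of that combination (the objective, budget model, and prior below are all hypothetical, not the authors' code):

```python
import random

def successive_halving(sample_config, evaluate, n_configs=27, min_budget=1, eta=3, seed=0):
    """Keep the best 1/eta of configs at each rung, multiplying the budget by eta."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        scored = sorted(configs, key=lambda c: evaluate(c, budget))
        configs = scored[: max(1, len(configs) // eta)]
        budget *= eta
    return configs[0]

# Toy objective: "loss" improves with budget; optimal learning rate is 0.1.
def evaluate(lr, budget):
    return (lr - 0.1) ** 2 + 1.0 / budget

uniform_prior = lambda rng: rng.uniform(1e-4, 1.0)
# Hypothetical data-driven prior learned from earlier datasets: mass near 0.1.
learned_prior = lambda rng: min(max(rng.gauss(0.1, 0.05), 1e-4), 1.0)

best = successive_halving(learned_prior, evaluate)
```

Swapping `uniform_prior` for `learned_prior` is the whole intervention: the search machinery stays unchanged, only the sampling distribution carries the transferred knowledge.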
Optimization of the Random Forest Hyperparameters for Power Industrial Control Systems Intrusion Detection Using an Improved Grid Search Algorithm
2022
The intrusion detection method of power industrial control systems is a crucial aspect of assuring power security. However, traditional intrusion detection methods have two drawbacks: first, they are mainly used for defending information systems and lack the ability to detect attacks against power industrial control systems; and second, although machine learning-based intrusion detection methods perform well with default hyperparameters, optimizing the hyperparameters can significantly improve their performance. In response to these limitations, a random forest (RF)-based intrusion detection model for power industrial control systems is proposed. Simultaneously, this paper proposes an improved grid search algorithm (IGSA) for optimizing the hyperparameters of the RF intrusion detection model to improve its efficiency and effectiveness. The proposed IGSA reduces the computational complexity from O(n^m) to O(n × m). The suggested model is evaluated on the public power industrial control system dataset after hyperparameter optimization. The experimental results show that our method achieves superior detection performance, with an accuracy of 98%, and outperforms comparable work.
Journal Article
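The abstract does not spell out how the IGSA reaches O(n × m) evaluations, so the following is an assumed mechanism, not the paper's algorithm: coordinate-wise grid search, tuning one hyperparameter at a time while holding the others fixed, costs n evaluations for each of m parameters instead of n^m for the exhaustive grid. The objective below is a toy stand-in for RF validation error with a hypothetical optimum at 200 trees and depth 10.

```python
def coordinate_grid_search(objective, grid, start):
    """Tune one hyperparameter at a time: n*m evaluations instead of n**m."""
    current = dict(start)
    evals = 0
    for name, values in grid.items():
        best_v, best_s = current[name], objective(current)
        for v in values:
            trial = {**current, name: v}
            score = objective(trial)
            evals += 1
            if score < best_s:
                best_v, best_s = v, score
        current[name] = best_v   # fix this coordinate before moving on
    return current, evals

# Toy separable "validation error" with a hypothetical optimum (200 trees, depth 10).
err = lambda p: (p["n_estimators"] - 200) ** 2 / 1e4 + (p["max_depth"] - 10) ** 2
grid = {"n_estimators": [50, 100, 200, 400], "max_depth": [5, 10, 20, 40]}
best, evals = coordinate_grid_search(err, grid, {"n_estimators": 50, "max_depth": 5})
```

Here 8 evaluations (4 + 4) replace the 16 of a full grid; the saving grows exponentially with the number of hyperparameters, though coordinate-wise search is only guaranteed to find the joint optimum when the objective is roughly separable.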
Advanced hyperparameter optimization for improved spatial prediction of shallow landslides using extreme gradient boosting (XGBoost)
by
Teke, Alihan
,
Kavzoglu, Taskin
in
Earth and Environmental Science
,
Earth Sciences
,
Foundations
2022
Machine learning algorithms have progressively become a part of landslide susceptibility mapping practices owing to their robustness in dealing with complicated and non-linear mechanisms of landslides. However, the internal structures of such algorithms contain a set of hyperparameter configurations whose correct setting is crucial to achieving the highest attainable performance. This study investigates the effectiveness and robustness of advanced optimization algorithms, including random search (RS), Bayesian optimization with Gaussian Process (BO-GP), Bayesian optimization with Tree-structured Parzen Estimator (BO-TPE), genetic algorithm (GA), and the Hyperband method, for optimizing the hyperparameters of the eXtreme Gradient Boosting (XGBoost) algorithm in the spatial prediction of landslides. Twelve causative factors were considered to produce landslide susceptibility maps (LSMs) for the Trabzon province of Turkey, where translational shallow landslides are ubiquitous. Five accuracy metrics (overall accuracy (OA), precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC)) and a statistical significance test were employed to measure the effectiveness of the optimization strategies on the XGBoost algorithm. Compared to the XGBoost model with default settings, the optimized models provided a significant improvement of up to 13% in overall accuracy, which was also ascertained by McNemar's test. AUC analysis revealed that the GA (0.942) and Hyperband (0.922) methods, whose performances were statistically similar, had the highest predictive abilities, followed by BO-GP (0.920), BO-TPE (0.899), and RS (0.894). Analysis of computational cost showed that the Hyperband approach (40.3 s) was much faster (about 13 times) than the GA in hyperparameter tuning, and thus appeared to be the best optimization algorithm for the problem under consideration.
Journal Article