Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
26 result(s) for "Cyclic learning rate"
MSWNet: A visual deep machine learning method adopting transfer learning based upon ResNet 50 for municipal solid waste sorting
by Cui, Feifei; Lin, Kunsen; Wang, Lina
in artificial intelligence, Chemical composition, Chemical reactions
2023
● MSWNet was proposed to classify municipal solid waste. ● Transfer learning improved the performance of MSWNet. ● A cyclical learning rate was adopted to quickly tune hyperparameters.
An intelligent and efficient methodology is needed owing to the continuous increase of global municipal solid waste (MSW), because the common methods of manual and semi-mechanical screening not only consume large amounts of manpower and material resources but also accelerate community transmission of viruses. Because MSW categories are diverse in composition, chemical reactions, and processing procedures, traditional methods achieve low sorting efficiency. Deep machine learning can make MSW sorting smarter and more efficient. This study applied MSWNet, a ResNet-50 with transfer learning, to MSW sorting for the first time. The cyclical learning rate method was adopted to avoid blindly searching for a good learning rate through repeated trial and error. Visualization measures were also used to make the MSWNet model more transparent and accountable. Results showed that transfer learning shortened training time (from 741 s to 598.5 s) and improved recognition accuracy (from 88.50% to 93.50%); MSWNet showed better performance in MSW classification in terms of sensitivity (93.50%), precision (93.40%), F1-score (93.40%), accuracy (93.50%) and AUC (92.00%). The findings of this study can serve as a reference for building deep learning models for MSW classification, quantifying a suitable learning rate, and reducing data from high dimensions to two dimensions.
Journal Article
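As a rough illustration of the setup this abstract describes (ResNet-50 transfer learning tuned with a cyclical learning rate), the PyTorch sketch below shows one common way to wire these pieces together; the class count, learning rate bounds, and step size are placeholders, not values from the paper.

```python
# Minimal sketch: ResNet-50 transfer learning trained with a triangular
# cyclical learning rate; class count, bounds, and step size are placeholders.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 6  # hypothetical number of MSW categories
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new classifier head

optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-4, max_lr=1e-2,
    step_size_up=500, mode="triangular")  # LR oscillates between the two bounds
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the cyclical schedule once per batch
    return loss.item()
```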
Denigration analysis of Twitter data using cyclic learning rate based long short-term memory
Technological innovation has given rise to a new form of bullying, often leading to significant harm to one's reputation within social circles. When a single person becomes the target of animosity and harassment in a cyberbullying incident, it is termed denigration. Many cyberbullying detection techniques have been developed to counter this, but they concentrate on word-based data and user account features only. The main objective of this research is to enhance the learning rate of long short-term memory (LSTM) using a cyclic learning rate (CLR). Therefore, in this research, cyberbullying in social media is detected by developing a framework based on LSTM-CLR, which is more stable and enhances classification accuracy without the need for multiple trials and modifications. The effectiveness of the suggested LSTM-CLR is assessed for identifying cyberbullying using Twitter data. The attained results show that the proposed LSTM-CLR obtains 82% accuracy, 80% precision, 83% recall and 81% F-measure in the classification of cyberbullying tweets, which is superior compared with the existing multilayer perceptron (MLP) and bidirectional encoder representations from transformers (BERT) models.
Journal Article
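The abstract does not spell out the LSTM-CLR architecture, so the following is only a minimal sketch of attaching PyTorch's cyclical learning rate scheduler to an LSTM tweet classifier; all dimensions and bounds are assumed for illustration.

```python
# Rough sketch of an LSTM tweet classifier trained with a cyclical learning
# rate; all sizes and bounds are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 2)  # bullying vs. non-bullying

    def forward(self, token_ids):
        _, (h_n, _) = self.lstm(self.embed(token_ids))
        return self.out(h_n[-1])  # logits from the last hidden state

model = LSTMClassifier()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
# "triangular2" halves the cycle amplitude after each cycle, so the learning
# rate keeps oscillating but gradually settles without manual retuning.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-3, max_lr=1e-1,
    step_size_up=200, mode="triangular2")
```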
A Study on Cyclical Learning Rates in Reinforcement Learning and Its Application to Temperature and Power Consumption Control of Refrigeration System
by Wang, Jingchen; Hashimoto, Seiji; Motegi, Kazuhiro
in Algorithms, Climate change, Control systems
2024
In recent years, with the advancement of computer hardware technology, an increasing number of complex control systems have begun employing reinforcement learning over traditional PID control to address the challenge of managing multiple outputs simultaneously. In this study, we adopted the cyclical learning rate method, which is widely used in deep learning, and applied it for the first time to deep reinforcement learning. A detailed simulation model was developed in MATLAB Simulink with the RSD-4TFK5J refrigeration storage unit as the reference object. We evaluated the effects of various cyclical learning rate strategies on the training process of the model. The simulation results demonstrate the effectiveness of the cyclical learning rate method during the training phase, showcasing its potential to enhance learning efficiency and system performance in complex control environments.
Journal Article
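The study itself is implemented in MATLAB Simulink; the Python sketch below only illustrates the general idea of cycling the learning rate of a reinforcement learning agent's optimizer during training, with placeholder values throughout.

```python
# Illustrative Python analogue of cycling the learning rate in a deep RL loop;
# the paper's own implementation is in MATLAB Simulink, so this is only a sketch.
import math
import torch

def triangular_lr(step, base_lr=1e-4, max_lr=1e-3, step_size=1000):
    """Triangular cyclical learning rate (Smith, 2017) as a plain function."""
    cycle = math.floor(1 + step / (2 * step_size))
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

policy_net = torch.nn.Linear(4, 2)  # stand-in for the agent's network
optimizer = torch.optim.Adam(policy_net.parameters(), lr=1e-4)

for step in range(5000):  # outer RL training loop (schematic)
    lr = triangular_lr(step)
    for group in optimizer.param_groups:  # apply the cyclical LR manually
        group["lr"] = lr
    # ... collect experience, compute the RL loss, and call optimizer.step() ...
```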
State of Charge Estimation of Lithium-Ion Batteries Employing Deep Neural Network with Variable Learning Rate
by Madhavan Namboothiri, Kannan; K., Sundareswaran; Simon, Sishaj P
in Accuracy, Algorithms, Artificial neural networks
2023
Deep learning (DL) has gained a lot of attention in the domain of estimating the State of Charge (SoC) of lithium-ion batteries used in electric vehicles (EVs). However, it is still challenging to develop an estimation model that is accurate, trustworthy, and computationally inexpensive. This research proposes a deep neural network (DNN) employing different learning rate optimization strategies. The proposed approach is compared with the conventional learning rate strategy. Further, existing and well-established neural networks, namely long short-term memory, bi-directional long short-term memory, and gated recurrent unit networks, are employed and tested under identical conditions. The proposed architecture is trained and tested using different dynamic discharge profiles. The computational cost and the results of various performance metrics demonstrate the accuracy of the proposed approach.
Journal Article
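The abstract does not name the specific learning rate strategies compared, so the sketch below simply contrasts a fixed learning rate with one variable schedule (reduce-on-plateau) on a small SoC regression network; the network shape and schedule are assumptions rather than details from the paper.

```python
# Sketch of the comparison idea: the same SoC-regression DNN trained once with a
# fixed learning rate and once with a variable schedule; all details are placeholders.
import torch
import torch.nn as nn

def make_model():
    # Inputs such as voltage, current, and temperature mapped to SoC in [0, 1].
    return nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(),
                         nn.Linear(64, 1), nn.Sigmoid())

def train(variable_lr, epochs=50):
    model = make_model()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = (torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=3)
                 if variable_lr else None)
    loss_fn = nn.MSELoss()
    for epoch in range(epochs):
        # ... iterate over discharge-profile batches: loss_fn(model(x), soc), backward, step ...
        val_loss = 0.0  # placeholder for the epoch's validation loss
        if scheduler is not None:
            scheduler.step(val_loss)  # lower the LR when validation stops improving
    return model

fixed_model = train(variable_lr=False)     # conventional fixed learning rate
variable_model = train(variable_lr=True)   # variable learning rate schedule
```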
Enhanced copy-move forgery detection using deep convolutional neural network (DCNN) employing the ResNet-101 transfer learning model
by Vaishali, Sharma; Neetu, Singh
in Computer Communication Networks, Computer Science, Data Structures and Information Theory
2024
The rapid proliferation of high-quality false images on social media sites calls for research on systems that can recognize legitimate images. Copy-move forgery (CMF), which involves copying portions of an image and pasting them elsewhere in the same image, is one of the most commonly used image-altering methods. Due to the problem of exploding and vanishing gradients, existing Convolutional Neural Network (CNN) models must be trained for up to 100 epochs to achieve their best accuracy. In this work, a deep CNN (DCNN) model using the residual network with 101 deep layers has been used. To solve the problem of exploding and vanishing gradients, the concept of skip connections has been included in the residual network. In addition, to maximize the performance of the suggested ResNet-101 model, the cyclical learning rate (CLR) hyper-parameter is utilized to further tune the model. The model was trained and evaluated using a variety of datasets, including MICC-F600, MICC-F2000, MICC-F220, and CoMoFoD v2. Accuracy, error rate, true positive rate (TPR), false positive rate (FPR), true negative rate (TNR), and false negative rate (FNR) were analyzed quantitatively. The proposed model achieves its highest accuracy of 97.75% after training for only 5 epochs on the CoMoFoD v2 dataset. For the MICC-F220, MICC-F600 and MICC-F2000 datasets, the achieved accuracies were 96.09%, 97.63% and 96.87%, respectively, after training for at most 10 epochs. To demonstrate the efficacy of the suggested approach, a comparative study with various state-of-the-art models from the literature is presented.
Journal Article
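The abstract does not say how the CLR bounds were chosen; a common way to pick them is Smith's learning rate range test, sketched below for a ResNet-101 binary classifier with placeholder data loading.

```python
# Generic learning-rate range test, often used to pick base_lr/max_lr for CLR;
# this is a common-practice sketch, not the paper's procedure.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)  # forged vs. authentic
optimizer = torch.optim.SGD(model.parameters(), lr=1e-6, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def lr_range_test(loader, start_lr=1e-6, max_lr=1.0, growth=1.1):
    """Increase the LR geometrically and record the training loss at each step."""
    lr, records = start_lr, []
    for images, labels in loader:
        for group in optimizer.param_groups:
            group["lr"] = lr
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        records.append((lr, loss.item()))
        lr *= growth
        if lr > max_lr:
            break
    # CLR bounds are then read off the curve: base_lr where the loss first
    # starts to drop, max_lr just before the loss diverges.
    return records
```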
Remote Sensing Scene Classification and Explanation Using RSSCNet and LIME
by Wu, Hui-Ching; Hung, Sheng-Chieh; Tseng, Ming-Hseng
in Accuracy, Classification, cyclical learning rate
2020
Remote sensing image classification is needed in disaster investigation, traffic control, and land-use resource management. How to quickly and accurately classify such remote sensing imagery has become a popular research topic. However, the application of large, deep neural network models for the training of classifiers in the hope of obtaining good classification results is often very time-consuming. In this study, a new CNN (convolutional neural network) architecture, i.e., RSSCNet (remote sensing scene classification network), with high generalization capability was designed. Moreover, a two-stage cyclical learning rate policy and the no-freezing transfer learning method were developed to speed up model training and enhance accuracy. In addition, the manifold learning t-SNE (t-distributed stochastic neighbor embedding) algorithm was used to verify the effectiveness of the proposed model, and the LIME (local interpretable model-agnostic explanations) algorithm was applied to improve the results in cases where the model made wrong predictions. Comparing the results on three publicly available datasets with those obtained in previous studies, the experimental results show that the model and method proposed in this paper can achieve better scene classification more quickly and more efficiently.
Journal Article
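The exact two-stage cyclical learning rate policy and the RSSCNet architecture are not given in the abstract; the sketch below only conveys the idea of a wide learning rate cycle followed by a narrower one with no layers frozen, using a stand-in backbone and placeholder numbers.

```python
# Hedged sketch of a "two-stage" cyclical LR with no-freezing transfer learning;
# the backbone, class count, and all numbers are placeholders.
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)  # stand-in for RSSCNet
model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, 45)  # e.g. 45 scene classes
for p in model.parameters():
    p.requires_grad = True  # "no-freezing": every layer stays trainable

optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

def cyclic(base_lr, max_lr):
    return torch.optim.lr_scheduler.CyclicLR(
        optimizer, base_lr=base_lr, max_lr=max_lr, step_size_up=500)

stage1 = cyclic(1e-4, 1e-2)  # stage 1: wide cycles to explore the loss surface
# ... train the first stage, calling stage1.step() after every batch ...
stage2 = cyclic(1e-5, 1e-3)  # stage 2: narrower cycles to settle into a minimum
# ... continue training, now calling stage2.step() after every batch ...
```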
An empirical study of cyclical learning rate on neural machine translation
2023
In training deep learning networks, the optimizer and the associated learning rate are often used without much thought or with minimal tuning, even though they are crucial for ensuring fast convergence to a good-quality minimum of the loss function that also generalizes well on the test dataset. Drawing inspiration from the successful application of the cyclical learning rate policy to computer vision tasks, we explore how cyclical learning rates can be applied to train transformer-based networks for neural machine translation. From our carefully designed experiments, we show that the choice of optimizer and the associated cyclical learning rate policy can have a significant impact on performance. In addition, we establish guidelines for applying cyclical learning rates to neural machine translation tasks.
Journal Article
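To make the optimizer/CLR interaction concrete, the sketch below pairs a small transformer with a cyclical learning rate under either SGD or Adam; the model size, bounds, and step sizes are illustrative and not taken from the paper.

```python
# Sketch: the same transformer NMT model with a cyclical LR under two optimizers;
# all hyperparameters are placeholders.
import torch
import torch.nn as nn

model = nn.Transformer(d_model=256, nhead=8,
                       num_encoder_layers=4, num_decoder_layers=4)

def build(optim_name):
    if optim_name == "sgd":
        opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
        sched = torch.optim.lr_scheduler.CyclicLR(
            opt, base_lr=1e-3, max_lr=1e-1, step_size_up=4000)
    else:  # Adam-style optimizers have no momentum buffer to cycle
        opt = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.98))
        sched = torch.optim.lr_scheduler.CyclicLR(
            opt, base_lr=1e-4, max_lr=1e-3, step_size_up=4000,
            cycle_momentum=False)
    return opt, sched

# The abstract's point is that the (optimizer, CLR policy) pair matters, so both
# configurations would be trained and compared on the same translation data.
sgd_setup = build("sgd")
adam_setup = build("adam")
```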
Advanced Optimization Techniques for Federated Learning on Non-IID Data
by Efthymiadis, Filippos; Sioutas, Spyros; Karras, Aristeidis
in Accuracy, Algorithms, Artificial intelligence
2024
Federated learning enables model training on multiple clients locally, without the need to transfer their data to a central server, thus ensuring data privacy. In this paper, we investigate the impact of Non-Independent and Identically Distributed (non-IID) data on the performance of federated training, where we find a reduction in accuracy of up to 29% for neural networks trained in environments with skewed non-IID data. Two optimization strategies are presented to address this issue. The first strategy focuses on applying a cyclical learning rate to determine the learning rate during federated training, while the second strategy develops a sharing and pre-training method on augmented data in order to improve the efficiency of the algorithm in the case of non-IID data. By combining these two methods, experiments show that the accuracy on the CIFAR-10 dataset increased by about 36% while achieving faster convergence by reducing the number of required communication rounds by 5.33 times. The proposed techniques lead to improved accuracy and faster model convergence, thus representing a significant advance in the field of federated learning and facilitating its application to real-world scenarios.
Journal Article
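The paper's first strategy applies a cyclical learning rate during federated training; one plausible reading, sketched below, is to cycle the client learning rate across communication rounds in a FedAvg-style loop. The toy model, client count, and bounds are assumptions.

```python
# Sketch of a cyclical learning rate applied per communication round in a
# FedAvg-style loop; aggregation and client training are schematic placeholders.
import copy
import math
import torch
import torch.nn as nn

def cyclical_lr(round_idx, base_lr=1e-3, max_lr=1e-1, step_size=10):
    cycle = math.floor(1 + round_idx / (2 * step_size))
    x = abs(round_idx / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

global_model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))  # toy CIFAR-10 model

def client_update(model, lr):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    # ... local epochs on the client's (possibly non-IID) data shard ...
    return model.state_dict()

for rnd in range(100):                       # communication rounds
    lr = cyclical_lr(rnd)                    # same cyclical LR for every client this round
    client_states = [client_update(copy.deepcopy(global_model), lr) for _ in range(5)]
    # FedAvg: average client weights into the global model
    avg = {k: torch.stack([s[k].float() for s in client_states]).mean(0)
           for k in client_states[0]}
    global_model.load_state_dict(avg)
```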
Aerial Scene Classification through Fine-Tuning with Adaptive Learning Rates and Label Smoothing
by Atanasova-Pacemska, Tatjana; Mignone, Paolo; Corizzo, Roberto
in convolutional neural network, cyclical learning rates, fine-tuning
2020
Remote Sensing (RS) image classification has recently attracted great attention for its application in different tasks, including environmental monitoring, battlefield surveillance, and geospatial object detection. The best practices for these tasks often involve transfer learning from pre-trained Convolutional Neural Networks (CNNs). A common approach in the literature is to employ CNNs for feature extraction and subsequently train classifiers on the extracted features. In this paper, we propose the adoption of transfer learning by fine-tuning pre-trained CNNs for end-to-end aerial image classification. Our approach performs feature extraction from the fine-tuned neural networks and remote sensing image classification with a Support Vector Machine (SVM) model with linear and Radial Basis Function (RBF) kernels. To tune the learning rate hyperparameter, we employ a linear decay learning rate scheduler as well as cyclical learning rates. Moreover, in order to mitigate the overfitting problem of pre-trained models, we apply label smoothing regularization. For the fine-tuning and feature extraction process, we adopt the Inception-v3 and Xception inception-based CNNs, as well as the residual-based networks ResNet50 and DenseNet121. We present extensive experiments on two real-world remote sensing image datasets: AID and NWPU-RESISC45. The results show that the proposed method exhibits classification accuracy of up to 98%, outperforming other state-of-the-art methods.
Journal Article
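The sketch below strings together the ingredients named in the abstract: fine-tuning a pre-trained CNN with label smoothing and a cyclical learning rate, then fitting an SVM on the extracted features. Hyperparameters and the placeholder batch are illustrative only.

```python
# Sketch: label-smoothed fine-tuning with a cyclical LR, then an SVM on features;
# the backbone choice, numbers, and placeholder data are assumptions.
import torch
import torch.nn as nn
from torchvision import models
from sklearn.svm import SVC

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 45)           # e.g. NWPU-RESISC45 classes

criterion = nn.CrossEntropyLoss(label_smoothing=0.1)      # label smoothing regularization
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-4, max_lr=1e-2, step_size_up=500)
# ... fine-tune end-to-end, stepping optimizer and scheduler once per batch ...

# After fine-tuning, drop the classifier head and use the backbone for features.
backbone = nn.Sequential(*list(model.children())[:-1])    # global-pooled features
backbone.eval()
with torch.no_grad():
    feats = backbone(torch.randn(8, 3, 224, 224)).flatten(1).numpy()  # placeholder batch
labels = [0, 1] * 4                                        # placeholder labels
svm = SVC(kernel="rbf").fit(feats, labels)                 # or kernel="linear"
```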
Combination of Optimization Methods in a Multistage Approach for a Deep Neural Network Model
by Zubair, Swaleha; Singha, Anjani Kumar
in Algorithms, Artificial Intelligence, Artificial neural networks
2024
Gradient descent (GD) lies at the heart and soul of neural networks, and the development of GD optimization algorithms significantly sped up the advancement of deep learning. Some research projects have attempted to mix multiple training approaches to improve network performance, but these methods remain primarily empirical and need more theoretical guidance. This paper develops an architecture to demonstrate the combination of various GD optimization methodologies by analyzing different learning rates and numerous adaptive methods. The aim is to show how different GD optimization methods can be combined in a multistage approach for training a deep learning model. This research was motivated by the principles of SGDR (stochastic gradient descent with warm restarts), warm-up, and CLR (cyclical learning rates). The results of training tests with a large deep learning network validate the efficiency of the technique. The experiments were carried out in Python on Google Colab.
Journal Article
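The abstract names SGDR, warm-up, and cyclical learning rates as its inspirations without detailing the multistage recipe; the sketch below chains a warm-up phase into cosine annealing with warm restarts as one plausible combination, with all stage lengths assumed.

```python
# Sketch of one way to chain the cited ideas: linear warm-up followed by
# SGDR-style cosine annealing with warm restarts; stage lengths are placeholders.
import torch
import torch.nn as nn

model = nn.Linear(100, 10)                   # stand-in for a deep network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

warmup = torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=0.1, total_iters=500)
restarts = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=1000, T_mult=2)
schedule = torch.optim.lr_scheduler.SequentialLR(
    optimizer, schedulers=[warmup, restarts], milestones=[500])

for step in range(5000):
    # ... forward pass, loss.backward(), optimizer.step() ...
    schedule.step()                          # warm-up first, then cosine restarts
```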