Catalogue Search | MBRL

A novel approach to workload prediction using attention-based LSTM encoder-decoder network in cloud environment

by Chen, Yihai , Zhu, Yonghua , Gao, Honghao in Coders , Encoders-Decoders , Workload

2019

Server workload in the form of cloud-end clusters is a key factor in server maintenance and task scheduling. How to balance and optimize hardware resources and computation resources should thus receive more attention. However, we have observed that the disordered execution of running application and batching seriously cuts down the efficiency of the server. To improve the workload prediction accuracy, this paper proposes an approach using the long short-term memory (LSTM) encoder-decoder network with attention mechanism. First, the approach extracts the sequential and contextual features of the historical workload data through the encoder network. Second, the model integrates the attention mechanism into the decoder network, through which the prediction for batch workloads can be carried out. Third, experiments carried out on Alibaba and Dinda workload traces dataset demonstrate that our method achieves state-of-the-art performance in mixed workload prediction in cloud computing environment. Furthermore, we also propose a scroll prediction method, which splits a long prediction sequence into several small sequences to monitor and control prediction accuracy. This work helps to dynamically guide the configuration for workload balancing.

Journal Article

Share this book

Add to My Shelf

REDfold: accurate RNA secondary structure prediction using residual encoder-decoder network

by Chen, Chun-Chi , Chan, Yi-Ming in Accuracy , Algorithms , Base Sequence

2023

Background As the RNA secondary structure is highly related to its stability and functions, the structure prediction is of great value to biological research. The traditional computational prediction for RNA secondary prediction is mainly based on the thermodynamic model with dynamic programming to find the optimal structure. However, the prediction performance based on the traditional approach is unsatisfactory for further research. Besides, the computational complexity of the structure prediction using dynamic programming is O ( N 3 ) ; it becomes O ( N 6 ) for RNA structure with pseudoknots, which is computationally impractical for large-scale analysis. Results In this paper, we propose REDfold, a novel deep learning-based method for RNA secondary prediction. REDfold utilizes an encoder-decoder network based on CNN to learn the short and long range dependencies among the RNA sequence, and the network is further integrated with symmetric skip connections to efficiently propagate activation information across layers. Moreover, the network output is post-processed with constrained optimization to yield favorable predictions even for RNAs with pseudoknots. Experimental results based on the ncRNA database demonstrate that REDfold achieves better performance in terms of efficiency and accuracy, outperforming the contemporary state-of-the-art methods.

Journal Article

Share this book

Add to My Shelf

Deep-Learning Forecasting Method for Electric Power Load via Attention-Based Encoder-Decoder with Bayesian Optimization

by Jin, Xue-Bo , Su, Ting-Li , Lin, Seng in Bayesian optimization , deep-learning encoder-decoder framework , electric power load prediction

2021

Short-term electrical load forecasting plays an important role in the safety, stability, and sustainability of the power production and scheduling process. An accurate prediction of power load can provide a reliable decision for power system management. To solve the limitation of the existing load forecasting methods in dealing with time-series data, causing the poor stability and non-ideal forecasting accuracy, this paper proposed an attention-based encoder-decoder network with Bayesian optimization to do the accurate short-term power load forecasting. Proposed model is based on an encoder-decoder architecture with a gated recurrent units (GRU) recurrent neural network with high robustness on time-series data modeling. The temporal attention layer focuses on the key features of input data that play a vital role in promoting the prediction accuracy for load forecasting. Finally, the Bayesian optimization method is used to confirm the model’s hyperparameters to achieve optimal predictions. The verification experiments of 24 h load forecasting with real power load data from American Electric Power (AEP) show that the proposed model outperforms other models in terms of prediction accuracy and algorithm stability, providing an effective approach for migrating time-serial power load prediction by deep-learning technology.

Journal Article

Share this book

Add to My Shelf

A Transformer-Based Bridge Structural Response Prediction Framework

by Li, Dongsheng , Li, Ziqi , Sun, Tianshu in Accuracy , bridge structural response prediction , Bridges

2022

Structural response prediction with desirable accuracy is considerably essential for the health monitoring of bridges. However, it appears to be difficult in accurately extracting structural response features on account of complex on-site environment and noise disturbance, resulting in poor prediction accuracy of the response values. To address this issue, a Transformer-based bridge structural response prediction framework was proposed in this paper. The framework contains multi-layer encoder modules and attention modules that can precisely capture the history-dependent features in time-series data. The effectiveness of the proposed method was validated with the use of six-month strain response data of a concrete bridge, and the results are also compared with those of the most commonly used Long Short-Term Memory (LSTM)-based structural response prediction framework. The analysis indicated that the proposed method was effective in predicting structural response, with the prediction error less than 50% of the LSTM-based framework. The proposed method can be applied in damage diagnosis and disaster warning of bridges.

Journal Article

Share this book

Add to My Shelf

Building Extraction of Aerial Images by a Global and Multi-Scale Encoder-Decoder Network

by Wu, Linlin , Ma, Jingjing , Tang, Xu in aerial image , Algorithms , building extraction

2020

Semantic segmentation is an important and challenging task in the aerial image community since it can extract the target level information for understanding the aerial image. As a practical application of aerial image semantic segmentation, building extraction always attracts researchers’ attention as the building is the specific land cover in the aerial images. There are two key points for building extraction from aerial images. One is learning the global and local features to fully describe the buildings with diverse shapes. The other one is mining the multi-scale information to discover the buildings with different resolutions. Taking these two key points into account, we propose a new method named global multi-scale encoder-decoder network (GMEDN) in this paper. Based on the encoder-decoder framework, GMEDN is developed with a local and global encoder and a distilling decoder. The local and global encoder aims at learning the representative features from the aerial images for describing the buildings, while the distilling decoder focuses on exploring the multi-scale information for the final segmentation masks. Combining them together, the building extraction is accomplished in an end-to-end manner. The effectiveness of our method is validated by the experiments counted on two public aerial image datasets. Compared with some existing methods, our model can achieve better performance.

Journal Article

Share this book

Add to My Shelf

SpikeSegNet-a deep learning approach utilizing encoder-decoder network with hourglass for spike segmentation and counting in wheat plant from visual imaging

by Sahoo, Rabi Narayan , Jain, Rajni , Arora, Alka in Accuracy , Agricultural production , Artificial neural networks

2020

Background High throughput non-destructive phenotyping is emerging as a significant approach for phenotyping germplasm and breeding populations for the identification of superior donors, elite lines, and QTLs. Detection and counting of spikes, the grain bearing organs of wheat, is critical for phenomics of a large set of germplasm and breeding lines in controlled and field conditions. It is also required for precision agriculture where the application of nitrogen, water, and other inputs at this critical stage is necessary. Further, counting of spikes is an important measure to determine yield. Digital image analysis and machine learning techniques play an essential role in non-destructive plant phenotyping analysis. Results In this study, an approach based on computer vision, particularly object detection, to recognize and count the number of spikes of the wheat plant from the digital images is proposed. For spike identification, a novel deep-learning network, SpikeSegNet, has been developed by combining two proposed feature networks: Local Patch extraction Network (LPNet) and Global Mask refinement Network (GMRNet). In LPNet, the contextual and spatial features are learned at the local patch level. The output of LPNet is a segmented mask image, which is further refined at the global level using GMRNet. Visual (RGB) images of 200 wheat plants were captured using LemnaTec imaging system installed at Nanaji Deshmukh Plant Phenomics Centre, ICAR-IARI, New Delhi. The precision, accuracy, and robustness (F 1 score) of the proposed approach for spike segmentation are found to be 99.93%, 99.91%, and 99.91%, respectively. For counting the number of spikes, “analyse particles”—function of imageJ was applied on the output image of the proposed SpikeSegNet model. For spike counting, the average precision, accuracy, and robustness are 99%, 95%, and 97%, respectively. SpikeSegNet approach is tested for robustness with illuminated image dataset, and no significant difference is observed in the segmentation performance. Conclusion In this study, a new approach called as SpikeSegNet has been proposed based on combined digital image analysis and deep learning techniques. A dedicated deep learning approach has been developed to identify and count spikes in the wheat plants. The performance of the approach demonstrates that SpikeSegNet is an effective and robust approach for spike detection and counting. As detection and counting of wheat spikes are closely related to the crop yield, and the proposed approach is also non-destructive, it is a significant step forward in the area of non-destructive and high-throughput phenotyping of wheat.

Journal Article

Share this book

Add to My Shelf

Fast Adaptive RNN Encoder–Decoder for Anomaly Detection in SMD Assembly Machine

by Park, YeongHyeon , Yun, Il Dong in anomaly detection , Communication , fast adaptation

2018

Surface Mounted Device (SMD) assembly machine manufactures various products on a flexible manufacturing line. An anomaly detection model that can adapt to the various manufacturing environments very fast is required. In this paper, we proposed a fast adaptive anomaly detection model based on a Recurrent Neural Network (RNN) Encoder–Decoder with operating machine sounds. RNN Encoder–Decoder has a structure very similar to Auto-Encoder (AE), but the former has significantly reduced parameters compared to the latter because of its rolled structure. Thus, the RNN Encoder–Decoder only requires a short training process for fast adaptation. The anomaly detection model decides abnormality based on Euclidean distance between generated sequences and observed sequence from machine sounds. Experimental evaluation was conducted on a set of dataset from the SMD assembly machine. Results showed cutting-edge performance with fast adaptation.

Journal Article

Share this book

Add to My Shelf

Gated RedNet for image denoising

by Huibin Zhang , Jing Li , Liping Feng in Encoder-decoder , GatedRedNet , Image denoising

2026

Deep learning has garnered widespread attention in the computer vision field, particularly for image denoising tasks. While Transformer-based models currently represent the state-of-the-art in denoising performance, this work investigates whether the architecture based on convolutional neural networks (CNN) can achieve comparable results. We propose GatedRedNet, a novel three-layer encoder-decoder network designed for image denoising. To enhance network depth while maintaining efficiency, GatedRedNet incorporates attention mechanisms within its residual blocks. Specifically, we incorporate the Efficient Channel Attention (ECA) module within all residual blocks, enabling effective recovery of edge and texture details with minimal computational overhead. To further improve global feature learning, we introduce a learnable gating factor at the encoder-decoder fusion stage, dynamically adjusting the weight ratio between their feature maps. These designs enable GatedRedNet to better capture local texture features while maintaining robust denoising capabilities. We evaluate GatedRedNet extensively on multiple benchmark datasets, demonstrating denoising performance comparable to state-of-the-art Transformer-based models and even surpassing them in certain aspects. Our results suggest that well-designed CNN architectures remain competitive in image denoising, offering an efficient alternative to Transformer.

Journal Article

Share this book

Add to My Shelf

GVA: guided visual attention approach for automatic image caption generation

by Hossen, Md. Bipul , Abdussalam, Amr , Ye, Zhongfu in Artificial neural networks , Computer Communication Networks , Computer Graphics

2024

Automated image caption generation with attention mechanisms focuses on visual features including objects, attributes, actions, and scenes of the image to understand and provide more detailed captions, which attains great attention in the multimedia field. However, deciding which aspects of an image to highlight for better captioning remains a challenge. Most advanced captioning models utilize only one attention module to assign attention weights to visual vectors, but this may not be enough to create an informative caption. To tackle this issue, we propose an innovative and well-designed Guided Visual Attention (GVA) approach, incorporating an additional attention mechanism to re-adjust the attentional weights on the visual feature vectors and feed the resulting context vector to the language LSTM. Utilizing the first-level attention module as guidance for the GVA module and re-weighting the attention weights significantly enhances the caption’s quality. Recently, deep neural networks have allowed the encoder-decoder architecture to make use visual attention mechanism, where faster R-CNN is used for extracting features in the encoder and a visual attention-based LSTM is applied in the decoder. Extensive experiments have been implemented on both the MS-COCO and Flickr30k benchmark datasets. Compared with state-of-the-art methods, our approach achieved an average improvement of 2.4% on BLEU@1 and 13.24% on CIDEr for the MSCOCO dataset, as well as 4.6% on BLEU@1 and 12.48% on CIDEr score for the Flickr30K datasets, based on the cross-entropy optimization. These results demonstrate the clear superiority of our proposed approach in comparison to existing methods using standard evaluation metrics. The implementing code can be found here: ( https://github.com/mdbipu/GVA ).

Journal Article

Share this book

Add to My Shelf

Enhancing Scene Text Recognition with Encoder–Decoder Interactive Model

by Mu, Yongbin , Maimaiti, Mieradilijiang , Li, Wenkai in Datasets , Deep learning , encoder-decoder interactive model

2025

Scene text recognition has significant application value in autonomous driving, smart retail, and assistive devices. However, due to challenges such as multi-scale variations, distortions, and complex backgrounds, existing methods such as CRNN, ViT, and PARSeq, while showing good performance, still have room for improvement in feature extraction and semantic modeling capabilities. To address these issues, this paper proposes a novel scene text recognition model named the Encoder–Decoder Interactive Model (EDIM). Based on an encoder–decoder framework, EDIM introduces a Multi-scale Dilated Fusion Attention (MSFA) module in the encoder to enhance multi-scale feature representation. In the decoder, a Sequential Encoder–Decoder Context Fusion (SeqEDCF) mechanism is designed to enable efficient semantic interaction between the encoder and decoder. The effectiveness of the proposed method is validated on six regular and irregular benchmark test sets, as well as various subsets of the Union14M-L dataset. Experimental results demonstrate that EDIM outperforms state-of-the-art (SOTA) methods across multiple metrics, achieving significant performance gains, especially in recognizing irregular and distorted text.

Journal Article

Share this book

Add to My Shelf

Language Selector

MBRLGlobalSearch

Language Selector

Catalogue Search | MBRL

Search Results Heading

Explore the vast range of titles available.

MBRLSearchResults

MBRLHappinessMeter