Catalogue Search | MBRL
423 result(s) for "encoder-decoder architecture"
A Domain-Specific Generative Chatbot Trained from Little Data
2020
Accurate generative chatbots are usually trained on large datasets of question–answer pairs. Such datasets do not exist for some languages, but that does not reduce companies' need for chatbot technology on their websites. However, companies usually own small domain-specific datasets (at least in the form of an FAQ) about their products, services, or used technologies. In this research, we seek effective solutions to create generative seq2seq-based chatbots from very small data. Since experiments are carried out in English and in the morphologically complex Lithuanian language, we have an opportunity to compare results for languages with very different characteristics. We experimentally explore three encoder–decoder LSTM-based approaches (simple LSTM, stacked LSTM, and BiLSTM), three word embedding types (one-hot encoding, fastText, and BERT embeddings), and five encoder–decoder architectures based on different encoder and decoder vectorization units. Furthermore, all offered approaches are applied to the pre-processed datasets with removed and separated punctuation. The experimental investigation revealed the advantages of the stacked LSTM and BiLSTM encoder architectures and BERT embedding vectorization (especially for the encoder). The best achieved BLEU on English/Lithuanian datasets with removed and separated punctuation was ~0.513/~0.505 and ~0.488/~0.439, respectively. Better results were achieved with the English language, because generating different inflection forms for the morphologically complex Lithuanian is a harder task. The BLEU scores fell into the range defining the quality of the generated answers as good or very good for both languages. This research was performed with very small datasets having little variety in covered topics, which makes this research not only more difficult, but also more interesting. Moreover, to our knowledge, it is the first attempt to train generative chatbots for a morphologically complex language.
Journal Article
Residual quadratic encoder–decoder architecture for semantic segmentation of satellite images
2025
Semantic segmentation is used to identify buildings, roads, vegetation cover, and water bodies in satellite images. Several state-of-the-art deep learning models have a large number of parameters and are difficult to train on low-configuration machines. To resolve this issue, this manuscript proposes two novel architectures built on a quadratic convolutional (QuadConv2D) layer: the quadratic encoder–decoder (QuadED) and the residual quadratic encoder–decoder (ResQuadED). In the experiments, it is observed that QuadED performs better for binary semantic segmentation and ResQuadED performs better for both binary and multi-class semantic segmentation. As a result, the QuadED achieves an IoU of 76.82% and 87.57% for the architectures with 5,51,913 and 4,90,088 parameters on the MBRSC and Massachusetts datasets, respectively. The ResQuadED model achieves an IoU of 77.87% and 89.27% for the architectures with 5,37,585 and 4,54,682 parameters on the MBRSC and Massachusetts datasets, respectively. These results reveal that the ResQuadED architecture performs significantly better on the evaluation metrics with fewer parameters than QuadED and existing architectures.
Journal Article
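The IoU figures quoted in the record above follow the usual intersection-over-union definition for segmentation masks. The sketch below is only an illustration of that metric on toy binary masks; the function name `iou` and the nested-list mask format are assumptions made here, not anything taken from the paper.

```python
def iou(pred, target):
    """Intersection over union of two equally sized binary masks.

    Masks are nested lists of 0/1 values (an illustrative format only).
    """
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1  # pixel is foreground in both masks
            if p or t:
                union += 1  # pixel is foreground in at least one mask
    # Two empty masks agree perfectly, so report 1.0 instead of dividing by zero.
    return inter / union if union else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(iou(pred, target))  # 2 shared pixels out of 4 in the union -> 0.5
```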
Semantic road segmentation using encoder-decoder architectures
by Sharma, Sanjeev; Latsaheb, Burhanuddin; Hasija, Sanskar
in Autonomous navigation, Computer Communication Networks, Computer Science
2025
Road detection is a fundamental task in autonomous driving, making accurate and efficient road area segmentation essential for the safe and precise navigation of autonomous vehicles. This paper proposes various models for road segmentation, employing an encoder-decoder architecture for fully automatic segmentation of road areas. As part of the encoder, this work explores different models, such as ResNet50V2, DenseNet121, DenseNet169, and DenseNet201, and utilizes them in one of the few dedicated methods for road area segmentation. Here, the dataset, derived from the Mapillary Vistas Dataset, has been meticulously pre-processed to convert it into a binary segmentation problem for road detection, comprising 8041 training images and 919 validation images with their respective masks. The models were trained on our dataset, achieving the highest Dice coefficient value of 99.61% on the training dataset and 93.85% on the validation dataset using the DenseNet169 encoder model. This research contributes to advancing the state-of-the-art in road segmentation for autonomous driving applications.
Journal Article
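The Dice coefficient reported in the record above measures overlap between a predicted mask and its ground-truth mask. A minimal sketch of the standard definition, 2|A∩B| / (|A| + |B|), is given here on toy data; the function name `dice` and the nested-list mask format are illustrative assumptions, not the authors' code.

```python
def dice(pred, target):
    """Dice coefficient of two equally sized binary masks (nested lists of 0/1)."""
    # Overlap: pixels that are foreground in both masks.
    inter = sum(p * t
                for pred_row, target_row in zip(pred, target)
                for p, t in zip(pred_row, target_row))
    # Total foreground pixels across both masks.
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Two empty masks agree perfectly, so report 1.0 instead of dividing by zero.
    return 2 * inter / total if total else 1.0


road_pred = [[1, 1, 0],
             [0, 1, 0]]
road_true = [[1, 0, 0],
             [0, 1, 1]]
print(dice(road_pred, road_true))  # 2 * 2 / (3 + 3)
```

For the same pair of masks Dice is never lower than IoU, which is worth keeping in mind when comparing scores across records that report different metrics.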
Adaptive Feature Medical Segmentation Network: an adaptable deep learning paradigm for high-performance 3D brain lesion segmentation in medical imaging
by Hassan, Haseeb; Miao, Xiaoqiang; Huang, Bingding
in adaptive feature extraction, attention mechanism, brain lesion segmentation
2024
In neurological diagnostics, accurate detection and segmentation of brain lesions is crucial. Identifying these lesions is challenging due to their complex morphology, especially when using traditional methods. Conventional methods are either computationally demanding with a marginal impact/enhancement or sacrifice fine details for computational efficiency. Therefore, balancing performance and precision in compute-intensive medical imaging remains a hot research topic.
We introduce a novel encoder-decoder network architecture named the Adaptive Feature Medical Segmentation Network (AFMS-Net) with two encoder variants: the Single Adaptive Encoder Block (SAEB) and the Dual Adaptive Encoder Block (DAEB). A squeeze-and-excite mechanism is employed in SAEB to identify significant data while disregarding peripheral details. This approach is best suited for scenarios requiring quick and efficient segmentation, with an emphasis on identifying key lesion areas. In contrast, the DAEB utilizes an advanced channel spatial attention strategy for fine-grained delineation and multiple-class classifications. Additionally, both architectures incorporate a Segmentation Path (SegPath) module between the encoder and decoder, refining segmentation, enhancing feature extraction, and improving model performance and stability.
AFMS-Net demonstrates exceptional performance across several notable datasets, including BraTS 2021, ATLAS 2021, and ISLES 2022. Its design aims to construct a lightweight architecture capable of handling complex segmentation challenges with high precision.
The proposed AFMS-Net addresses the critical balance issue between performance and computational efficiency in the segmentation of brain lesions. By introducing two tailored encoder variants, the network adapts to varying requirements of speed and feature. This approach not only advances the state-of-the-art in lesion segmentation but also provides a scalable framework for future research in medical image processing.
Journal Article
The exploration of a Temporal Convolutional Network combined with Encoder-Decoder framework for runoff forecasting
by Xu, Chong-Yu; Guo, Shenglian; Lin, Kangling
in artificial neural network, Coders, Concentration time
2020
The Temporal Convolutional Network (TCN) and TCN combined with the Encoder-Decoder architecture (TCN-ED) are proposed to forecast runoff in this study. Both models are trained and tested using the hourly data in the Jianxi basin, China. The results indicate that the forecast horizon has a great impact on the forecast ability, and the concentration time of the basin is a critical threshold to the effective forecast horizon for both models. Both models perform poorly at low flows and well at medium and high flows at most forecast horizons, while their performance in forecasting peak flow depends on the forecast horizon. TCN-ED has better performance than TCN in runoff forecasting, with higher accuracy, better stability, and insensitivity to fluctuations in the rainfall process. Therefore, TCN-ED is an effective deep learning solution in runoff forecasting within an appropriate forecast horizon.
Journal Article
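Temporal Convolutional Networks, as named in the record above, are generally built from causal, dilated 1D convolutions, so each output depends only on current and past inputs. The sketch below illustrates that single building block on a toy series; it is not the TCN or TCN-ED model from the study, and the function name, kernel weights, and input values are all assumptions made for illustration.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Causal dilated convolution: y[t] = sum_k weights[k] * x[t - k * dilation].

    Inputs before the start of the series are treated as zero, so y[t]
    never depends on values later than x[t].
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that would reach before the series start
                acc += w * x[idx]
        out.append(acc)
    return out


series = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy hourly values, purely illustrative
print(causal_dilated_conv1d(series, [0.5, 0.5], dilation=2))
# [0.5, 1.0, 2.0, 3.0, 4.0]
```

Stacking such layers with growing dilation widens the span of past time steps each output can see, which is the usual motivation for dilation in this kind of model.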
Video description: A comprehensive survey of deep learning approaches
2023
Video description refers to understanding visual content and transforming that acquired understanding into automatic textual narration. It bridges the key AI fields of computer vision and natural language processing in conjunction with real-time and practical applications. Deep learning-based approaches employed for video description have demonstrated enhanced results compared to conventional approaches. The current literature lacks a thorough interpretation of the recently developed and employed sequence to sequence techniques for video description. This paper fills that gap by focusing mainly on deep learning-enabled approaches to automatic caption generation. Sequence to sequence models follow an Encoder–Decoder architecture employing a specific composition of CNN, RNN, or the variants LSTM or GRU as an encoder and decoder block. This standard architecture can be fused with an attention mechanism to focus on specific distinctive features, achieving high quality results. Reinforcement learning employed within the Encoder–Decoder structure can progressively deliver state-of-the-art captions by following exploration and exploitation strategies. The transformer mechanism is a modern and efficient transductive architecture for robust output. Free from recurrence, and solely based on self-attention, it allows parallelization along with training on a massive amount of data. It can fully utilize the available GPUs for most NLP tasks. Recently, with the emergence of several versions of transformers, long term dependency handling is no longer an issue for researchers engaged in video processing for summarization and description, or for autonomous-vehicle, surveillance, and instructional purposes. This survey offers them promising directions.
Journal Article
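The attention mechanism mentioned in the record above lets a decoder weight the encoder outputs differently at each step. The sketch below shows one common form, dot-product scoring followed by a softmax; the survey does not say which scoring function any given model uses, so the scoring choice, the function name `attend`, and the toy vectors are all assumptions.

```python
import math


def attend(query, keys, values):
    """Dot-product attention for a single decoder query.

    Returns the softmax attention weights and the weighted sum of values.
    """
    # Similarity between the query and each encoder key.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    # Context vector: attention-weighted average of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Three toy encoder states (e.g. frame features) and one decoder query.
frame_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
frame_values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, context = attend([2.0, 2.0], frame_keys, frame_values)
print(weights)
print(context)
```

In this toy case the third key aligns best with the query, so it receives the largest weight.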
Coarse-to-Fine Satellite Images Change Detection Framework via Boundary-Aware Attentive Network
2020
Timely and accurate change detection on satellite images using computer vision techniques has attracted considerable research effort in recent years. Existing approaches based on deep learning frameworks have achieved good performance for the task of change detection on satellite images. However, when changed areas are disjoint and vary in shape across the land surface, existing methods still have shortcomings in detecting all changed areas correctly and in representing the boundaries of the changed areas. To deal with these problems, we design a coarse-to-fine detection framework via a boundary-aware attentive network with a hybrid loss to detect change in high-resolution satellite images. Specifically, we first apply an attention-guided encoder-decoder subnet to obtain the coarse change map of the bi-temporal image pairs, and then apply residual learning to obtain the refined change map. We also propose a hybrid loss to provide supervision at the pixel, patch, and map levels. Comprehensive experiments are conducted on two benchmark datasets, LEBEDEV and SZTAKI, to verify the effectiveness of the proposed method, and the experimental results show that our model achieves state-of-the-art performance.
Journal Article
Deep Learning-Based Feature Silencing for Accurate Concrete Crack Detection
by La, Hung Manh; Tavakkoli, Alireza; Billah, Umme Hafsa
in convolutional neural network, crack detection, encoder-decoder architecture
2020
An autonomous concrete crack inspection system is necessary for preventing hazardous incidents arising from deteriorated concrete surfaces. In this paper, we present a concrete crack detection framework to aid the process of automated inspection. The proposed approach employs a deep convolutional neural network architecture for crack segmentation, while addressing the effect of the vanishing gradient problem. A feature silencing module is incorporated in the proposed framework, capable of eliminating non-discriminative feature maps from the network to improve performance. Experimental results support the benefit of incorporating feature silencing within a convolutional neural network architecture for improving the network’s robustness, sensitivity, and specificity. An added benefit of the proposed architecture is its ability to accommodate the trade-off between sensitivity (positive class detection accuracy) and specificity (negative class detection accuracy) with respect to the target application. Furthermore, the proposed framework achieves a higher precision rate and a better processing time than the state-of-the-art crack detection architectures.
Journal Article
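Under the standard definitions, sensitivity is the true-positive rate and specificity is the true-negative rate, the two quantities traded off in the record above. The sketch below computes both from binary labels; the function name and the toy label lists are illustrative assumptions and are not data from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = crack."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cracks found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cracks missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # background kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


labels = [1, 1, 1, 1, 0, 0, 0, 0]       # toy ground truth
predictions = [1, 1, 1, 0, 0, 0, 1, 1]  # toy model output
print(sensitivity_specificity(labels, predictions))  # (0.75, 0.5)
```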
Enhanced remaining useful life prediction of lithium-ion battery based on a dual attention hybrid data-driven method
by Ibrahim, AL-Wesabi; Al-Shamma’a, Abdullrahman A.; Shi, Zhenglu
in 639/166, 639/166/987, And electric vehicles
2026
Electric vehicles (EVs) and wheelchairs rely heavily on lithium-ion batteries (LIBs) for safe, reliable, and uninterrupted mobility. Accurate Remaining Useful Life (RUL) prediction of EV and wheelchair batteries is paramount for averting unexpected failures, extending service longevity, and devising proactive maintenance protocols. Although bidirectional gated recurrent unit (BiGRU) networks are proficient at modeling temporal dependencies, their exclusive dependence on sequential input constrains prediction precision. To circumvent this issue, this study introduces an encoder–decoder architecture that synergistically integrates dual attention mechanisms with BiGRU. Within the encoder, an attention-augmented BiGRU module selectively emphasizes critical aging features. An additional attention mechanism dynamically refines the input sequence to ensure no essential information is attenuated during encoding. The decoder, structured as a BiGRU network, subsequently delineates precise capacity degradation trajectories. Comprehensive evaluations conducted on the NASA and CALCE datasets substantiate the model efficacy. On the NASA dataset, the proposed framework yields absolute errors of 1, 4, 0, and 2 for cells B0005, B0006, B0007, and B0018, respectively. Comparably minimal error margins are observed on the CALCE dataset. Further comparative analyses reveal that the proposed approach surpasses existing benchmarks in predictive accuracy, robustness, and generalizability. This high-fidelity RUL prediction model holds substantial potential for advancing battery management systems in EVs and wheelchairs, reinforcing operational reliability, and elevating safety and end-user experience.
Journal Article
IndiVNet A region adaptive semantic image segmentation for autonomous driving in unstructured environments
by Bandyopadhyay, Anjan; Chakraborty, Pritam; Platos, Jan
in 639/166, 639/705, Autonomous vehicles
2025
Autonomous navigation in developing regions is challenged by heterogeneous traffic, dynamic occlusions, and weak road structure. Existing segmentation models, largely trained on structured Western datasets, struggle to generalize under these conditions. To address this gap, we propose IndiVNet, a semantic segmentation architecture tailored for unstructured Indian driving environments. IndiVNet introduces a progressive dilation encoder (6 16) that captures fine-grained details and broad contextual cues without inducing oversparsity. Evaluated on the India Driving Dataset (IDD), it achieves 69.98% mIoU, outperforming CNN and Transformer baselines, and reaches 73.2% mIoU on CAMVID, demonstrating strong cross-domain generalization. By combining contextual adaptability with real-time efficiency, IndiVNet offers a scalable, region-aware solution for robust autonomous navigation in complex environments.
Journal Article