Catalogue Search | MBRL
6 result(s) for "Adaptive gate feature fusion"
YOLOv5_mamba: unmanned aerial vehicle object detection based on bidirectional dense feedback network and adaptive gate feature fusion
by Guo, Chengcheng; Wu, Shixiao; Lu, Xingyuan
in Adaptive gate feature fusion, Mamba, Object detection
2024
Objects in unmanned aerial vehicle (UAV) aerial images are small and carry limited feature information, so existing detection algorithms perform poorly on small objects. To address this, we propose a UAV aerial object detection system named YOLOv5_mamba, based on a bidirectional dense feedback network and adaptive gate feature fusion. This paper improves the You Only Look Once Version 5 (YOLOv5) algorithm in three ways. First, the Faster Implementation of CSP Bottleneck with 2 convolutions (C2f) module from YOLOv8 is introduced into the backbone network to enhance its feature extraction capability. Second, the mamba module and C2f module are used to construct a bidirectional dense feedback network that enhances the transfer of contextual information in the neck. Third, an adaptive gate feature fusion network is proposed to improve the head of YOLOv5 and enhance its final detection capability. Experimental results on the public UAV aerial dataset VisDrone2019 show that the proposed algorithm improves detection accuracy by 9.3% over the original YOLOv5 baseline network, with better detection performance for small objects. On the UCAS_AOD dataset, the proposed algorithm outperforms YOLOv5-s by 9%, and on the DIOR dataset it exceeds YOLOv5-s by 12%.
Journal Article
A dual-domain perception gate-controlled adaptive fusion algorithm for road crack detection
2025
Road crack detection presents critical challenges, including diverse defect patterns and complex anomaly characteristics. The current object detection algorithms demonstrate deficiencies in considering feature redundancy across channel-spatial dimensions, employ indiscriminate fusion strategies for multi-stage feature information, and particularly neglect the high-frequency characteristics inherent in crack features, leading to inefficient network performance and a loss of crucial information. Building upon the identified limitations, this paper proposes a dual-domain perception gate-controlled adaptive fusion network (DP-DETR) that achieves dynamic perception of salient features across channel and spatial domains within latent space. To enhance focus on critical features, a dual-domain dynamic perception information distillation mechanism is constructed, which distills redundant features separately across channel and spatial domains, effectively reducing architectural processing redundancy while achieving discriminative characteristic representation efficiency. In order to address the challenge of coarse-grained fusion in multi-stage feature integration, a feature information gating-adaptive fusion module (FGAF-Fusion) is proposed, which facilitates interactive channel-spatial information fusion through mixed local channel attention while employing gated adaptive fusion operations to selectively retain critical semantic information of small-scale targets. In response to the persistent high-frequency signature identified within crack feature distributions, a dual-domain structural feature enhancement loss function is designed, which elevates the weighting of high-frequency information by leveraging a spectral weighting matrix, while complementarily enhancing crack edge texture features in the spatial domain through gradient map integration. 
The experimental results obtained on the public RDD2022 dataset demonstrate that the proposed DP-DETR achieves mAP50 and mAP50:95 values of 54.2% and 25.8%, respectively, representing improvements of 6.7 and 4.2 percentage points over RT-DETR. In road crack detection tasks, DP-DETR effectively detects various types of road defects, with highly competitive detection results and good robustness. The code will be released at https://github.com/jiangsu415/DP-DETR.
Journal Article
MSGFNet: Multi-Scale Gated Fusion Network for Remote Sensing Image Change Detection
2024
Change detection (CD) is a pivotal yet challenging task in the interpretation of remote sensing images. The field has developed significantly, particularly with the rapid advancement of deep learning techniques. Nevertheless, challenges such as incompletely detected targets and unsmooth boundaries remain, because most CD methods suffer from ineffective feature fusion. This paper therefore presents a multi-scale gated fusion network (MSGFNet) to improve the accuracy of CD results. To effectively extract bi-temporal features, the EfficientNetB4 model based on a Siamese network is employed. Subsequently, we propose a multi-scale gated fusion module (MSGFM) that comprises a multi-scale progressive fusion (MSPF) unit and a gated weight adaptive fusion (GWAF) unit, aimed at fusing bi-temporal multi-scale features to maintain boundary details and detect changed targets completely. Finally, we use the simple yet efficient UNet structure to recover the feature maps and predict results. To demonstrate the effectiveness of the MSGFNet, the LEVIR-CD, WHU-CD, and SYSU-CD datasets were used, and the MSGFNet achieved F1 scores of 90.86%, 92.46%, and 80.39% on the three datasets, respectively. Furthermore, its low computational cost and small model size support the superior performance of the MSGFNet.
Journal Article
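Several records in this result list describe gated adaptive feature fusion. As a rough illustration of the general idea only, and not the method of any paper listed here, the sketch below blends two feature vectors with a per-element sigmoid gate. The function name gated_fusion and the parameters w_a, w_b, and bias are hypothetical, and plain Python lists stand in for feature maps.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(feat_a, feat_b, w_a, w_b, bias):
    """Blend two equal-length feature vectors with a per-element gate.

    Hypothetical sketch: for each position the gate is
    g = sigmoid(w_a * a + w_b * b + bias), and the fused value is
    g * a + (1 - g) * b. In a trained network w_a, w_b, and bias
    would be learned; here they are passed in by hand.
    """
    lengths = {len(feat_a), len(feat_b), len(w_a), len(w_b), len(bias)}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length")

    fused = []
    for a, b, wa, wb, c in zip(feat_a, feat_b, w_a, w_b, bias):
        gate = sigmoid(wa * a + wb * b + c)
        fused.append(gate * a + (1.0 - gate) * b)
    return fused


# With zero weights and bias the gate is 0.5 everywhere,
# so the fusion reduces to a plain average of the two inputs.
print(gated_fusion([1.0, 3.0], [3.0, 5.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
# prints [2.0, 4.0]
```

A strongly positive gate input pushes the result toward the first vector and a strongly negative one toward the second, which is what lets such a gate weight the two sources differently at each position.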
Sub-pixel multi-scale fusion network for medical image segmentation
by Fang, Xian; Li, Jing; Chen, Qiaohong
in 1237: Advanced Deep Learning for Computer Vision and Multimedia Applications, Accuracy, Artificial neural networks
2024
CNNs and Transformers have significantly advanced the domain of medical image segmentation. Integrating their strengths facilitates rich feature extraction but also introduces the challenge of fusing mixed multi-scale features. To overcome this issue, we propose a deep medical image segmentation framework termed Sub-pixel Multi-scale Fusion Network (SMFNet), which incorporates the sub-pixel multi-scale feature fusion results of CNN and Transformer into the architecture. Our design consists of three modules. First, we use the Sub-pixel Convolutional Module to bring the features extracted at multiple scales to a consistent resolution. Next, we develop the Three-level Enhancement Module to learn features from adjacent layers and perform information exchange. Lastly, we use the Hierarchical Adaptive Gate to fuse information from other contextual levels through the Sub-pixel Convolutional Module. Extensive experiments on the Synapse, ACDC, and ISIC 2018 datasets demonstrate the effectiveness of the proposed SMFNet, which outperforms other competitive CNN-based and Transformer-based segmentation methods.
Journal Article
Palmprint recognition based on gating mechanism and adaptive feature fusion
by Bai, Litao; Xu, Guofeng; Yang, Xun
in adaptive feature fusion, convolutional neural networks (CNN), deep learning-based artificial neural networks
2023
As a type of biometric recognition, palmprint recognition uses unique discriminative features on a person's palm to establish that person's identity. It has attracted much attention because it is contactless, stable, and secure. Recently, many palmprint recognition methods based on convolutional neural networks (CNN) have been proposed in academia. However, convolutional neural networks are limited by the size of the convolutional kernel and lack the ability to extract global information from palmprints. This paper proposes a palmprint recognition framework, GLGAnet, that integrates CNN and Transformer and can take advantage of CNN's local information extraction and Transformer's global modeling capabilities. A gating mechanism and an adaptive feature fusion module are also designed for palmprint feature extraction. The gating mechanism filters features with a feature selection algorithm, and the adaptive feature fusion module fuses them with the features extracted by the backbone network. Extensive experiments on two datasets show a recognition accuracy of 98.5% for 12,000 palmprints in the Tongji University dataset and 99.5% for 600 palmprints in the Hong Kong Polytechnic University dataset. This demonstrates that the proposed method outperforms existing methods in accuracy on both palmprint recognition tasks. The source code will be available at https://github.com/Ywatery/GLnet.git.
Journal Article