Catalogue Search | MBRL

YOLOv5_mamba: unmanned aerial vehicle object detection based on bidirectional dense feedback network and adaptive gate feature fusion

by Guo, Chengcheng , Wu, Shixiao , Lu, Xingyuan in Adaptive gate feature fusion , Mamba , Object detection

2024

To address the problem that objects in unmanned aerial vehicle (UAV) aerial images are too small and carry limited feature information, which leaves existing detection algorithms with less than ideal small-object performance, we propose a UAV aerial object detection system named YOLOv5_mamba, based on a bidirectional dense feedback network and adaptive gate feature fusion. This paper improves the You Only Look Once Version 5 (YOLOv5) algorithm in three ways. First, the Faster Implementation of CSP Bottleneck with 2 convolutions (C2f) module from YOLOv8 is introduced into the backbone network to enhance its feature extraction capability. Second, the mamba module and C2f module are used to construct a bidirectional dense feedback network that enhances the transfer of contextual information in the neck. Third, an adaptive gate feature fusion network is proposed to improve the head of YOLOv5 and enhance its final detection capability. Experimental results on the public UAV aerial dataset VisDrone2019 demonstrate that the proposed algorithm improves detection accuracy by 9.3% compared to the original YOLOv5 baseline network, showing better detection performance for small objects. On the UCAS_AOD dataset, the proposed algorithm outperforms YOLOv5-s by 9%. On the DIOR dataset, it exceeds YOLOv5-s by 12%.

Journal Article



A dual-domain perception gate-controlled adaptive fusion algorithm for road crack detection

by Feng, Yong’an , Zhang, Ziyang in 631/378/116/1925 , 631/378/116/2396 , Defect detection

2025

Road crack detection presents critical challenges, including diverse defect patterns and complex anomaly characteristics. The current object detection algorithms demonstrate deficiencies in considering feature redundancy across channel-spatial dimensions, employ indiscriminate fusion strategies for multi-stage feature information, and particularly neglect the high-frequency characteristics inherent in crack features, leading to inefficient network performance and a loss of crucial information. Building upon the identified limitations, this paper proposes a dual-domain perception gate-controlled adaptive fusion network (DP-DETR) that achieves dynamic perception of salient features across channel and spatial domains within latent space. To enhance focus on critical features, a dual-domain dynamic perception information distillation mechanism is constructed, which distills redundant features separately across channel and spatial domains, effectively reducing architectural processing redundancy while achieving discriminative characteristic representation efficiency. In order to address the challenge of coarse-grained fusion in multi-stage feature integration, a feature information gating-adaptive fusion module (FGAF-Fusion) is proposed, which facilitates interactive channel-spatial information fusion through mixed local channel attention while employing gated adaptive fusion operations to selectively retain critical semantic information of small-scale targets. In response to the persistent high-frequency signature identified within crack feature distributions, a dual-domain structural feature enhancement loss function is designed, which elevates the weighting of high-frequency information by leveraging a spectral weighting matrix, while complementarily enhancing crack edge texture features in the spatial domain through gradient map integration. 
The experimental results obtained on the public RDD2022 dataset demonstrate that the proposed DP-DETR approach achieves mAP50 and mAP50:95 values of 54.2% and 25.8%, respectively, representing improvements of 6.7 and 4.2 percentage points over RT-DETR. In road crack object detection tasks, the proposed DP-DETR method effectively detects various types of road defects, demonstrating highly competitive detection results and good robustness. The code will be released at https://github.com/jiangsu415/DP-DETR .

Journal Article


MSGFNet: Multi-Scale Gated Fusion Network for Remote Sensing Image Change Detection

by Hao, Zhonghu , Wang, Qiang , Ye, Yuanxin in Ablation , Accuracy , Artificial intelligence

2024

Change detection (CD) stands out as a pivotal yet challenging task in the interpretation of remote sensing images. Significant developments have been witnessed, particularly with the rapid advancements in deep learning techniques. Nevertheless, challenges such as incomplete detection targets and unsmooth boundaries remain as most CD methods suffer from ineffective feature fusion. Therefore, this paper presents a multi-scale gated fusion network (MSGFNet) to improve the accuracy of CD results. To effectively extract bi-temporal features, the EfficientNetB4 model based on a Siamese network is employed. Subsequently, we propose a multi-scale gated fusion module (MSGFM) that comprises a multi-scale progressive fusion (MSPF) unit and a gated weight adaptive fusion (GWAF) unit, aimed at fusing bi-temporal multi-scale features to maintain boundary details and detect completely changed targets. Finally, we use the simple yet efficient UNet structure to recover the feature maps and predict results. To demonstrate the effectiveness of the MSGFNet, the LEVIR-CD, WHU-CD, and SYSU-CD datasets were utilized, and the MSGFNet achieved F1 scores of 90.86%, 92.46%, and 80.39% on the three datasets, respectively. Furthermore, the low computational costs and small model size have validated the superior performance of the MSGFNet.
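
As a rough illustration of what "gated weight adaptive fusion" generally means, the sketch below blends two feature vectors with a learned sigmoid gate. It is a generic gating pattern, not the GWAF unit from the paper: the function name `gated_fusion`, the per-element weights `w_a` and `w_b`, and the `bias` parameter are all assumptions made for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(feat_a, feat_b, w_a, w_b, bias):
    """Fuse two equal-length feature vectors with an elementwise gate.

    Hypothetical sketch: for each element the gate is
    g = sigmoid(w_a * a + w_b * b + bias), and the fused value is
    g * a + (1 - g) * b. The weights would normally be learned.
    """
    lengths = {len(feat_a), len(feat_b), len(w_a), len(w_b), len(bias)}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length")

    fused = []
    for a, b, wa, wb, c in zip(feat_a, feat_b, w_a, w_b, bias):
        gate = sigmoid(wa * a + wb * b + c)       # how much to trust feat_a
        fused.append(gate * a + (1.0 - gate) * b)  # convex combination
    return fused
```

With all weights and biases at zero the gate is 0.5 everywhere and the result is the plain average of the two inputs; a large positive bias pushes the gate toward 1, so the output follows `feat_a`.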

Journal Article


Sub-pixel multi-scale fusion network for medical image segmentation

by Fang, Xian , Li, Jing , Chen, Qiaohong in 1237: Advanced Deep Learning for Computer Vision and Multimedia Applications , Accuracy , Artificial neural networks

2024

CNNs and Transformers have significantly advanced the domain of medical image segmentation. The integration of their strengths facilitates rich feature extraction but also introduces the challenge of mixed multi-scale feature fusion. To overcome this issue, we propose a deep medical image segmentation framework termed Sub-pixel Multi-scale Fusion Network (SMFNet), which incorporates the sub-pixel multi-scale feature fusion results of CNN and Transformer into the architecture. Our design consists of three modules. First, we use the Sub-pixel Convolutional Module to bring the features extracted at multiple scales to a consistent resolution. Next, we develop the Three-level Enhancement Module to learn features from adjacent layers and perform information exchange. Lastly, we leverage the Hierarchical Adaptive Gate to fuse information from other contextual levels through the Sub-pixel Convolutional Module. Extensive experiments on the Synapse, ACDC, and ISIC 2018 datasets demonstrate the effectiveness of the proposed SMFNet, and our method is superior to other competitive CNN-based or Transformer-based segmentation methods.
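
Sub-pixel convolution usually ends with a "pixel shuffle" step that trades channels for spatial resolution. The sketch below shows only that rearrangement, in plain Python on nested lists. It is not the paper's Sub-pixel Convolutional Module: the function name `pixel_shuffle` and the channel ordering (channel index `c * r * r + i * r + j` mapping to output offset `(i, j)`) are assumptions chosen to match the common convention.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Hypothetical sketch of the upscaling step used by sub-pixel
    convolution: each group of r*r input channels becomes one output
    channel whose spatial size is r times larger in both directions.
    """
    channels_in = len(x)
    if r < 1 or channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")

    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]

    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r block
            for j in range(r):      # column offset inside each r x r block
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out
```

For instance, four 1x1 channels with an upscale factor of 2 become a single 2x2 channel, which is how feature maps at a coarser scale can be brought to the resolution of a finer one.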

Journal Article


Palmprint recognition based on gating mechanism and adaptive feature fusion

by Bai, Litao , Xu, Guofeng , Yang, Xun in adaptive feature fusion , convolutional neural networks (CNN) , deep learning-based artificial neural networks

2023

As a type of biometric recognition, palmprint recognition uses unique discriminative features on the palm of a person to identify his/her identity. It has attracted much attention because of its advantages of contactlessness, stability, and security. Recently, many palmprint recognition methods based on convolutional neural networks (CNN) have been proposed in academia. Convolutional neural networks are limited by the size of the convolutional kernel and lack the ability to extract global information of palmprints. This paper proposes a framework based on the integration of CNN and Transformer-GLGAnet for palmprint recognition, which can take advantage of CNN's local information extraction and Transformer's global modeling capabilities. A gating mechanism and an adaptive feature fusion module are also designed for palmprint feature extraction. The gating mechanism filters features by a feature selection algorithm and the adaptive feature fusion module fuses them with the features extracted by the backbone network. Through extensive experiments on two datasets, the experimental results show that the recognition accuracy is 98.5% for 12,000 palmprints in the Tongji University dataset and 99.5% for 600 palmprints in the Hong Kong Polytechnic University dataset. This demonstrates that the proposed method outperforms existing methods in the correctness of both palmprint recognition tasks. The source codes will be available on https://github.com/Ywatery/GLnet.git .

Journal Article

