Catalogue Search | MBRL
Explore the vast range of titles available.
266 result(s) for "deformable convolution"
A deep learning approach for flood inundation mapping in polarimetric SAR images using DCNv3 and vision transformer
by YU Haiyang, ZHANG Chunfang, LIU Peng
in flood extent detection; polarimetric SAR; deformable convolution; vision transformer
2025
Objective: Accurate flood inundation detection using Synthetic Aperture Radar (SAR) images remains challenging due to limitations in existing models and the lack of high-quality annotated datasets. This study aims to address these issues by developing a dedicated flood inundation detection dataset based on polarimetric SAR data and proposing a novel deep learning model, FWSARNet, that integrates Deformable Convolutional Networks v3 (DCNv3) and Vision Transformer (ViT) to improve detection accuracy and robustness. Method: A polarimetric SAR-based dataset was constructed using Sentinel-1 imagery, with extensive data augmentation to enhance model generalization. An efficient feature extraction module was designed by combining DCNv3's spatial adaptability with ViT's global feature modeling. This module served as the backbone of the FWSARNet model, which was then trained and validated on two custom-built datasets: Henan720 and Hebei727. Result: The proposed FWSARNet model outperformed existing deep learning models in delineating complex flood features, including water body edges, small patches, and narrow linear segments. It achieved mean Intersection over Union (mIoU) values of 88.53% on Henan720 and 92.50% on Hebei727, indicating superior performance in diverse flood scenarios. Conclusion: FWSARNet demonstrates high accuracy and adaptability in flood inundation detection from SAR images and is well suited for emergency disaster response applications using polarimetric SAR data.
Journal Article
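The offset-driven sampling that DCNv3-style modules build on can be illustrated with a minimal NumPy sketch. This is the generic deformable-convolution idea, not the authors' FWSARNet code; the single-channel 3×3 setting and function names are assumptions made for brevity:

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly interpolate feat (H x W) at a fractional location (y, x)."""
    H, W = feat.shape
    y = min(max(y, 0.0), H - 1.0)
    x = min(max(x, 0.0), W - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    dy, dx = y - y0, x - x0
    return (feat[y0, x0] * (1 - dy) * (1 - dx) + feat[y0, x1] * (1 - dy) * dx
            + feat[y1, x0] * dy * (1 - dx) + feat[y1, x1] * dy * dx)

def deform_conv_point(feat, weights, offsets, py, px):
    """One output location of a 3x3 deformable convolution.

    Each of the 9 kernel taps samples the input at its regular grid position
    plus a learned fractional offset (dy, dx), instead of a fixed grid.
    offsets: array of shape (9, 2), one (dy, dx) per tap.
    """
    grid = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    out = 0.0
    for k, (ky, kx) in enumerate(grid):
        dy, dx = offsets[k]
        out += weights[k] * bilinear_sample(feat, py + ky + dy, px + kx + dx)
    return out
```

With all offsets zero this reduces exactly to an ordinary 3×3 convolution; nonzero offsets let the kernel follow irregular shapes such as water-body edges.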
Enhancing geometric modeling in convolutional neural networks: limit deformable convolution
by Meng, Yuanze; Chang, Guiyong; Li, Shun
in Artificial neural networks, Complexity, Computational Intelligence
2025
Convolutional neural networks (CNNs) are constrained in their capacity to model geometric transformations due to their fixed geometric structure. To overcome this problem, researchers introduced deformable convolution, which allows the convolution kernel to deform on the feature map. However, deformable convolution may introduce irrelevant contextual information during the learning process and thus degrade model performance. DCNv2 adds a modulation mechanism that weights each sampling point to control the contribution of its offset, but we find that the problem persists in practical use. We therefore propose a new limit deformable convolution, which enhances the model's ability by adding adaptive limiting units to constrain the offsets and adjusts the weight constraints on the offsets to improve its image-focusing ability. In subsequent work, we develop lightweight versions of the limit deformable convolution and design three kinds of LDBottleneck to suit different scenarios. The limit deformable network, equipped with the optimal LDBottleneck, improved mAP75 by 1.4% over DCNv1 and 1.1% over DCNv2 on the VOC2012+2007 dataset. Furthermore, on the COCO2017 dataset, different backbones equipped with our limit deformable module achieved satisfactory results. The source code for this work is publicly available at https://github.com/1977245719/LDCN.
Journal Article
Tea-YOLOv8s: A Tea Bud Detection Model Based on Deep Learning and Computer Vision
2023
Tea bud detection is essential for mechanized selective harvesting. To address the low detection precision caused by the complex backgrounds of tea leaves, this paper introduces a novel model called Tea-YOLOv8s. First, multiple data augmentation techniques are employed to increase the amount of information in the images and improve their quality. Then, Tea-YOLOv8s combines deformable convolutions, attention mechanisms, and improved spatial pyramid pooling, enhancing the model's ability to learn complex object invariance, reducing interference from irrelevant factors, and enabling multi-feature fusion, which together improve detection precision. Finally, the improved YOLOv8 model is compared with other models to validate the effectiveness of the proposed improvements. The results show that Tea-YOLOv8s achieves a mean average precision of 88.27% and an inference time of 37.1 ms, at the cost of 15.4 M additional parameters and 17.5 G additional computation. In conclusion, although the proposed approach increases the model's parameters and computation, it improves significantly on mainstream YOLO detection models in several respects and has the potential to be applied to mechanized tea bud picking.
Journal Article
FDD: a deep learning–based steel defect detector
2023
Surface defects are a common issue that affects product quality in industrial manufacturing. Many companies put considerable effort into developing automated inspection systems to handle this issue. In this work, we propose a novel deep learning–based surface defect inspection system, the forceful steel defect detector (FDD), designed especially for steel surface defect detection. Our model adopts the state-of-the-art Cascade R-CNN as the baseline architecture and improves it with deformable convolution and deformable RoI pooling to adapt to the geometric shapes of defects. Our model also adopts guided-anchoring region proposals to generate bounding boxes with higher accuracy. Moreover, to enrich the viewpoints of input images, we propose random scaling and ultimate scaling techniques for the training and inference processes, respectively. Experimental studies on the Severstal steel dataset, NEU steel dataset, and DAGM dataset demonstrate that our proposed model effectively improves detection accuracy in terms of average recall (AR) and mean average precision (mAP) compared to state-of-the-art defect detection methods. We expect our innovation to accelerate the automation of industrial manufacturing processes by increasing productivity and sustaining high product quality.
Journal Article
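The random scaling idea above — resizing each training image by a freshly drawn factor so the detector sees defects at varied sizes — can be sketched in a few lines. The (0.8, 1.2) range and the function name are illustrative assumptions, not the paper's exact setting:

```python
import random

def random_scale(image_hw, scale_range=(0.8, 1.2), rng=random):
    """Resize target for one training image.

    A fresh scale factor is drawn per image, so the same defect appears at a
    slightly different size every epoch, enriching the training distribution.
    """
    h, w = image_hw
    s = rng.uniform(*scale_range)
    return round(h * s), round(w * s)
```

At inference time a fixed ("ultimate") scale would be used instead, so predictions stay deterministic.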
An Optimized YOLOv11 Framework for the Efficient Multi-Category Defect Detection of Concrete Surface
2025
Thoroughly and accurately identifying the various defects on concrete surfaces is crucial to ensuring structural safety and prolonging service life. However, in actual engineering inspections, the varying shapes and complexities of concrete structural defects challenge the limited robustness and generalization of mainstream models, often leading to false and missed detections that ultimately jeopardize structural safety. To overcome these disadvantages, an efficient concrete defect detection model called YOLOv11-EMC (efficient multi-category concrete defect detection) is proposed. First, ordinary convolution is replaced with a modified deformable convolution to efficiently extract irregular defect features, significantly enhancing the model's robustness and generalization. Then, the C3k2 module is integrated with a revised dynamic convolution module, which reduces unnecessary computation while enhancing flexibility and feature representation. Experiments show that, compared with YOLOv11, YOLOv11-EMC improves precision, recall, mAP50, and F1 by 8.3%, 2.1%, 4.3%, and 3%, respectively. Drone field tests show that YOLOv11-EMC lowers false and missed detections while increasing detection accuracy, providing a superior methodology for identifying tangible flaws in practical engineering applications.
Journal Article
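Dynamic convolution of the kind the revised C3k2 module draws on mixes several candidate kernels with input-conditioned attention weights. A minimal sketch, with the attention branch reduced to a given logit vector (an assumption for brevity; in practice the logits come from pooling the input feature map through a small FC layer):

```python
import numpy as np

def dynamic_kernel(kernels, attn_logits):
    """Aggregate K candidate kernels into one input-conditioned kernel.

    kernels: array of shape (K, kh, kw) of learned candidate kernels.
    attn_logits: shape (K,), normally produced from the input feature map.
    Returns the softmax-weighted mixture, shape (kh, kw).
    """
    e = np.exp(attn_logits - np.max(attn_logits))  # numerically stable softmax
    w = e / e.sum()
    return np.tensordot(w, kernels, axes=1)
```

The mixture costs one extra weighted sum per forward pass but lets a single layer behave differently on different inputs, which is where the flexibility gain comes from.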
DCEF²-YOLO: Aerial Detection YOLO with Deformable Convolution–Efficient Feature Fusion for Small Target Detection
2024
Deep learning technology for real-time small object detection in aerial images can be used in various industrial environments such as real-time traffic surveillance and military reconnaissance. However, detecting small objects with few pixels and low resolution remains a challenging problem that requires performance improvement. To improve the performance of small object detection, we propose DCEF²-YOLO. Our proposed method enables efficient real-time small object detection by using a deformable convolution (DFConv) module and an efficient feature fusion structure to maximize the use of the internal feature information of objects. DFConv preserves small object information by preventing the mixing of object information with the background. The optimized feature fusion structure produces high-quality feature maps for efficient real-time small object detection while maximizing the use of limited information. Additionally, modifying the input data processing stage and reducing the detection layer to suit small object detection also contributes to performance improvement. When compared to the latest YOLO-based models (such as DCN-YOLO and YOLOv7), DCEF²-YOLO outperforms them, with a mAP of +6.1% on the DOTA-v1.0 test set, +0.3% on the NWPU VHR-10 test set, and +1.5% on the VEDAI512 test set. Furthermore, it has a fast processing speed of 120.48 FPS with an RTX3090 for 512 × 512 images, making it suitable for real-time small object detection tasks.
Journal Article
An optical flow estimation method based on multiscale anisotropic convolution
2024
To address the tracking accuracy degradation that occurs in scenarios with large displacements or nonrigid motion during target tracking, this paper proposes an optical flow estimation method based on multiscale anisotropic convolution. The network structure is improved step by step by extracting the data flow from the network according to the observed features. For the lower-level network, a layered multiscale structure builds a cascade network using hybrid dilated convolution to obtain feature information at different scales while maintaining tracking accuracy. For the upper-level network, hybrid dilated deformable convolution is used to learn the contextual long-range correlations and multidirectional adaptive offsets of features. Experiments are conducted on the Flying Chairs, KITTI, and MPI datasets. The results show that, compared with various popular methods, the proposed model reduces endpoint error while retaining edge information in regions with large displacements or nonrigid motion. Code is available at https://github.com/yifanna/MACFlow-pytorch.
Journal Article
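The benefit of hybrid dilated convolution is easy to quantify: for a stack of stride-1 convolutions, each layer enlarges the receptive field by (k − 1) · d. A small helper makes this concrete (the dilation rates in the example are illustrative, not taken from the paper):

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 dilated convolutions.

    rf = 1 + sum((k_i - 1) * d_i) over the layers, since layer i adds
    (k_i - 1) * d_i pixels of context on top of what the stack below sees.
    """
    return 1 + sum((k - 1) * d for k, d in zip(kernel_sizes, dilations))
```

Three 3×3 layers with dilations (1, 2, 5) cover a 17-pixel extent versus 7 for plain 3×3 layers, at identical parameter cost — which is why hybrid dilation captures multi-scale context cheaply.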
Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images
2022
We study the problem of object detection in remote sensing images. As a simple but effective feature extractor, the Feature Pyramid Network (FPN) has been widely used in several generic vision tasks. However, it still faces challenges when used for remote sensing object detection, as the objects in remote sensing images usually exhibit variable shapes, orientations, and sizes. To this end, we propose a dedicated object detector based on the FPN architecture to achieve accurate object detection in remote sensing images. Specifically, considering the variable shapes and orientations of remote sensing objects, we first replace the original lateral connections of FPN with Deformable Convolution Lateral Connection Modules (DCLCMs), each of which includes a 3×3 deformable convolution to generate feature maps with deformable receptive fields. We further introduce several Attention-based Multi-Level Feature Fusion Modules (A-MLFFMs) to integrate the multi-level outputs of FPN adaptively, enabling multi-scale object detection. Extensive experimental results on the DIOR dataset demonstrate the state-of-the-art performance of the proposed method, which achieves the highest mean Average Precision (mAP) of 73.6%.
Journal Article
Dynamic YOLO for small underwater object detection
2024
The practical application of object detection inevitably encounters challenges posed by small objects. In underwater object detection, a crucial method for marine exploration, the presence of small objects in underwater environments significantly hampers detection performance. In this paper, a dynamic YOLO detector is proposed as a solution to alleviate this problem. Specifically, a lightweight backbone network is first constructed based on deformable convolution v3, with some specialized designs for small object detection. Secondly, a unified feature fusion framework based on channel-wise, scale-wise, and spatial-aware attention is proposed to fuse feature maps from different scales. This is particularly critical for detecting small objects since it allows us to fully exploit the enhanced capabilities offered by our proposed backbone network. Finally, a simple but effective detection head is designed to handle the conflict between classification and localization by disentangling and aligning the two tasks. Extensive experiments are conducted on benchmark datasets to demonstrate the effectiveness of the proposed model. Without bells and whistles, dynamic YOLO outperforms the recent state-of-the-art methods by a large margin of +0.8 AP and +1.8 AP_S on the DUO dataset. Experimental results on the Pascal VOC and MS COCO datasets also demonstrate the superiority of the proposed method. Finally, ablation studies are conducted on the DUO dataset to validate the effectiveness and efficiency of each design in dynamic YOLO. Source code will be available at https://github.com/chenjie04/Dynamic-YOLO.
Journal Article
ADFCNN-BiLSTM: A Deep Neural Network Based on Attention and Deformable Convolution for Network Intrusion Detection
by Li, Bin; Jia, Mingyu; Li, Jie
in Algorithms, attention mechanism, bidirectional long short-term memory
2025
Network intrusion detection systems identify intrusion behavior in a network by analyzing network traffic data. It is challenging to detect the very small proportion of intrusion data within massive network traffic and to identify the attack class. Many existing intrusion detection studies fail to fully extract the spatial features of network traffic and to make reasonable use of temporal features. In this paper, we propose ADFCNN-BiLSTM, a novel deep neural network for network intrusion detection. ADFCNN-BiLSTM uses deformable convolution and an attention mechanism to adaptively extract the spatial features of network traffic data, attending to important features from both channel and spatial perspectives. It uses BiLSTM to mine the temporal features of the traffic data and employs a multi-head attention mechanism to let the network focus on the time-series information related to suspicious traffic. In addition, ADFCNN-BiLSTM addresses class imbalance during training at both the data level and the algorithm level. We evaluated the proposed ADFCNN-BiLSTM on three standard datasets, i.e., NSL-KDD, UNSW-NB15, and CICDDoS2019. The experimental results show that ADFCNN-BiLSTM outperforms state-of-the-art models in terms of accuracy, detection rate, and false-positive rate.
Journal Article
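Algorithm-level handling of class imbalance, as mentioned in the abstract above, is often done with inverse-frequency loss weights; the sketch below shows that common remedy in a generic form and is not necessarily the authors' exact scheme:

```python
from collections import Counter

def inverse_freq_weights(labels):
    """Per-class loss weights w_c = N / (C * n_c).

    Rare attack classes receive proportionally larger weights, so the loss
    is not dominated by the overwhelming share of benign traffic.
    N = total samples, C = number of classes, n_c = count of class c.
    The weights average to 1 across samples, leaving the loss scale intact.
    """
    counts = Counter(labels)
    n, c = len(labels), len(counts)
    return {cls: n / (c * cnt) for cls, cnt in counts.items()}
```

These weights would multiply each sample's loss term during training; a data-level alternative is to oversample the rare classes instead.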