Catalogue Search | MBRL

Improving YOLOv8m with Neck-Integrated Atrous Spatial Pyramid Pooling for Enhanced Detection of Small Fish and Jellyfish

by Hussein, Shaymaa Khudhur , Nemer, Zainab N.

2025

The ocean depths are vital for biodiversity, as they host numerous marine species essential for maintaining ecosystem health. Accurate identification of aquatic creatures is critical for developing effective conservation strategies and sustainable marine resource management. However, aquatic environments pose distinct challenges, including light scattering, occlusions, and turbidity. This paper presents an improved YOLOv8m architecture that integrates an Atrous Spatial Pyramid Pooling (ASPP) module to enhance multi-scale feature extraction. Our proposed model, YOLOv8m-ASPP, was evaluated against the baseline YOLOv8m on the Brackish dataset, which contains 8,417 annotated images across six marine categories (crab, jellyfish, fish, shrimp, small_fish, and starfish). The key architectural innovation involves integrating the ASPP module, with dilation rates of [2, 4, 6], into YOLOv8m's neck, specifically after the SPPF layer. This placement allows the ASPP module to process rich contextual features from the backbone, improving the ability of the model to capture objects at various scales. The YOLOv8m-ASPP model achieved an overall mAP@50 of 0.991 (a +0.002 increase) and a mAP@50-95 of 0.832 (a +0.004 increase) compared to the baseline YOLOv8m's 0.989 mAP@50 and 0.828 mAP@50-95. The modified model showed a precision of 0.980 and a recall of 0.979 while operating at approximately 60 FPS. Performance notably improved for challenging classes: the 'jellyfish' class mAP@50-95 rose to 0.757 (from the baseline's 0.730). Furthermore, robustness in small object detection was evident, with the 'small_fish' class achieving 0.970 mAP@50 (up from the baseline's 0.960 mAP@50). The findings demonstrate the effectiveness of the YOLOv8m-ASPP model for underwater ecological monitoring, successfully maintaining both detection accuracy and real-time processing capabilities. Future research could explore improved detection methods for small objects in environments with high turbidity.
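
To make the reported dilation rates concrete, the following minimal sketch computes how wide a window a single dilated convolution covers. It assumes 3x3 kernels in every ASPP branch; the abstract gives only the dilation rates, so the kernel size and the function name are illustrative assumptions, not details from the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Return the span, in pixels, covered by one dilated convolution kernel."""
    # Dilation inserts (dilation - 1) gaps between adjacent kernel taps.
    return kernel_size + (kernel_size - 1) * (dilation - 1)


ASPP_DILATION_RATES = [2, 4, 6]  # rates reported in the abstract
ASSUMED_KERNEL_SIZE = 3          # assumption: not stated in the abstract

spans = {d: effective_kernel_size(ASSUMED_KERNEL_SIZE, d) for d in ASPP_DILATION_RATES}
for dilation, span in spans.items():
    print(f"dilation {dilation}: covers a {span}x{span} window")
```

Under that assumption the three branches see progressively wider context without adding weights per branch, which is the usual motivation for atrous pooling.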

Journal Article

Lightweight Underwater Object Detection Based on YOLO v4 and Multi-Scale Attentional Feature Fusion

by He, Qi , Xu, Shubo , Song, Wei in Accuracy , Algorithms , attention mechanism

2021

A challenging and attractive task in computer vision is underwater object detection. Although object detection techniques have achieved good performance on general datasets, low visibility and color bias in the complex underwater environment lead to generally poor image quality; in addition, small targets and target aggregation leave less extractable information, which makes it difficult to achieve satisfactory results. In past research on underwater object detection based on deep learning, most studies have focused on improving detection accuracy by using large networks; lightweight underwater object detection has rarely received attention, which has resulted in large model sizes and slow detection speeds. As such, the application of object detection technologies in marine environments needs better real-time and lightweight performance. In view of this, a lightweight underwater object detection method based on MobileNet v2, the You Only Look Once (YOLO) v4 algorithm, and attentional feature fusion has been proposed to address this problem and to balance accuracy and speed for target detection in marine environments. In our work, a combination of MobileNet v2 and depth-wise separable convolution is proposed to reduce the number of model parameters and the size of the model. The Modified Attentional Feature Fusion (AFFM) module aims to better fuse semantic and scale-inconsistent features and to improve accuracy. Experiments indicate that the proposed method obtained a mean average precision (mAP) of 81.67% and 92.65% on the PASCAL VOC dataset and the Brackish dataset, respectively, and reached a processing speed of 44.22 frames per second (FPS) on the Brackish dataset. Moreover, the number of model parameters and the model size were compressed to 16.76% and 19.53% of YOLO v4, respectively, which achieved a good tradeoff between time and accuracy for underwater object detection.
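
As a rough illustration of why depth-wise separable convolution shrinks a model, the sketch below compares weight counts for a single layer. The channel sizes are arbitrary example values and biases are ignored; it is not an attempt to reproduce the paper's reported 16.76% figure, which refers to the whole network.

```python
def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a depth-wise k x k convolution followed by a 1x1 point-wise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer size, chosen only for illustration.
c_in, c_out = 256, 256
standard = standard_conv_params(c_in, c_out)
separable = depthwise_separable_params(c_in, c_out)
print(f"standard: {standard}, separable: {separable}, ratio: {separable / standard:.3f}")
```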

Journal Article

A lightweight YOLOv8 integrating FasterNet for real-time underwater object detection

by Guo, An , Sun, Kaiqiong , Zhang, Ziyi in Accuracy , Algorithms , Computer Graphics

2024

In this paper, we propose an underwater target detection method that optimizes YOLOv8s to make it more suitable for real-time use in underwater environments. First, a lightweight FasterNet module replaces the original backbone of YOLOv8s to reduce computation and improve the performance of the network. Second, we modify the current bi-directional feature pyramid network into a fast one by reducing unnecessary feature layers and changing the fusion method. Finally, we propose a lightweight-C2f structure by replacing the last standard convolution and the bottleneck module of C2f with a GSConv and a partial convolution, respectively, to obtain a lighter and faster block. Experiments on three underwater datasets, RUOD, UTDAC2020, and URPC2022, show that the proposed method achieves mAP50 of 86.8%, 84.3%, and 84.7%, respectively, with a speed of 156 FPS on NVIDIA A30 GPUs, which meets the requirement of real-time detection. Compared to the YOLOv8s model, the model volume is reduced on average by 24%, and the mAP accuracy is enhanced on all three datasets.

Journal Article

Dynamic YOLO for small underwater object detection

by Chen, Jie , Er, Meng Joo in Ablation , Analysis , Artificial Intelligence

2024

The practical application of object detection inevitably encounters challenges posed by small objects. In underwater object detection, a crucial method for marine exploration, the presence of small objects in underwater environments significantly hampers detection performance. In this paper, a dynamic YOLO detector is proposed as a solution to alleviate this problem. Specifically, a light-weight backbone network is first constructed based on deformable convolution v3, with some specialized designs for small object detection. Secondly, a unified feature fusion framework based on channel-wise, scale-wise, and spatial-aware attention is proposed to fuse feature maps from different scales. This is particularly critical for detecting small objects since it allows us to fully exploit the enhanced capabilities offered by our proposed backbone network. Finally, a simple but effective detection head is designed to handle the conflict between classification and localization by disentangling and aligning the two tasks. Extensive experiments are conducted on benchmark datasets to demonstrate the effectiveness of the proposed model. Without bells and whistles, dynamic YOLO outperforms the recent state-of-the-art methods by a large margin of +0.8 AP and +1.8 AP_S on the DUO dataset. Experimental results on the Pascal VOC and MS COCO datasets also demonstrate the superiority of the proposed method. Finally, ablation studies are conducted on the DUO dataset to validate the effectiveness and efficiency of each design in dynamic YOLO. Source code will be available at https://github.com/chenjie04/Dynamic-YOLO .
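
The AP_S figure above refers to average precision on small objects. The sketch below shows how objects are commonly bucketed by size, assuming the COCO convention (area below 32x32 pixels is "small", below 96x96 is "medium"); whether the DUO evaluation uses exactly these thresholds is an assumption here, and the handling of objects exactly on a boundary is simplified.

```python
SMALL_MAX_AREA = 32 ** 2    # COCO-style threshold, assumed to apply
MEDIUM_MAX_AREA = 96 ** 2   # COCO-style threshold, assumed to apply


def size_bucket(width: float, height: float) -> str:
    """Classify a bounding box as small, medium, or large by pixel area."""
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


# Hypothetical box sizes in pixels, for illustration only.
for box in [(20, 20), (50, 50), (100, 100)]:
    print(box, size_bucket(*box))
```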

Journal Article

HTDet: A Hybrid Transformer-Based Approach for Underwater Small Object Detection

by Chen, Gangqi , Wang, Kai , Shen, Junge in Accuracy , Algorithms , Aquaculture

2023

As marine observation technology develops rapidly, underwater optical image object detection is beginning to occupy an important role in many tasks, such as naval coastal defense tasks, aquaculture, etc. However, in the complex marine environment, the images captured by an optical imaging system are usually severely degraded. Therefore, how to detect objects accurately and quickly under such conditions is a critical problem that needs to be solved. In this manuscript, a novel framework for underwater object detection based on a hybrid transformer network is proposed. First, a lightweight hybrid transformer-based network is presented that can extract global contextual information. Second, a fine-grained feature pyramid network is used to overcome the issues of feeble signal disappearance. Third, the test-time-augmentation method is applied for inference without introducing additional parameters. Extensive experiments have shown that the approach we have proposed is able to detect feeble and small objects in an efficient and effective way. Furthermore, our model significantly outperforms the latest advanced detectors with respect to both the number of parameters and the mAP by a considerable margin. Specifically, our detector outperforms the baseline model by 6.3 points, and the model parameters are reduced by 28.5 M.

Journal Article

MAS-YOLOv11: An Improved Underwater Object Detection Algorithm Based on YOLOv11

by Fu, Qingqing , Wu, Aiping , Luo, Yang in Accuracy , Algorithms , Artificial intelligence

2025

To address the challenges of underwater target detection, including complex background interference, light attenuation, severe occlusion, and overlap between targets, as well as the wide-scale variation in objects, we propose MAS-YOLOv11, an improved model integrating three key enhancements: First, we introduce the C2PSA_MSDA module, which integrates multi-scale dilated attention (MSDA) into the C2PSA module of the backbone, enhancing multi-scale feature representation via dilated convolutions and cross-scale attention. Second, an adaptive spatial feature fusion detection head (ASFFHead) replaces the original head. By employing learnable spatial weighting parameters, ASFFHead adaptively fuses features across different scales, significantly improving the robustness of multi-scale object detection. Third, we introduce a Slide Loss function with dynamic sample weighting to enhance hard sample learning. By mapping the loss weights nonlinearly to detection confidence, this mechanism effectively enhances the overall detection accuracy. The experimental results demonstrate that the improved model yields significant performance advancements on the DUO dataset: the recall rate is enhanced by 3.7%, the F1-score is elevated by 3%, and the mAP@50 and mAP@50-95 attain values of 77.4% and 55.1%, respectively, representing increases of 3.5% and 3.3% compared to the baseline model. Furthermore, the model achieves an mAP@50 of 76% on the RUOD dataset, which further corroborates its cross-domain generalization capability.
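
The idea of fusing scales with learnable spatial weights can be sketched for a single spatial location as follows. This assumes the weights are normalized with a softmax so they sum to one, as in the original adaptive spatial feature fusion formulation; the abstract does not state the normalization used by ASFFHead, and all names and values below are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_location(features, weight_logits):
    """Fuse per-scale feature values at one location with normalized weights."""
    weights = softmax(weight_logits)  # assumed normalization: weights sum to 1
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical feature values from three pyramid levels at the same location.
features = [1.0, 2.0, 3.0]
print(fuse_location(features, [0.0, 0.0, 0.0]))   # equal weights give the mean
print(fuse_location(features, [0.0, 0.0, 5.0]))   # third scale dominates
```

In a real network the logits would be predicted per pixel by a small convolution and learned end to end; here they are fixed numbers to keep the sketch self-contained.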

Journal Article

Understanding the Influence of Image Enhancement on Underwater Object Detection: A Quantitative and Qualitative Study

by Awad, Ali , Lucas, Evan , Paheding, Sidike in Algorithms , Comparative analysis , Computer vision

2025

Underwater image enhancement is often perceived as a disadvantageous process to object detection. We propose a novel analysis of the interactions between enhancement and detection, elaborating on the potential of enhancement to improve detection. In particular, we evaluate object detection performance for each individual image rather than across the entire set to allow a direct performance comparison of each image before and after enhancement. This approach enables the generation of unique queries to identify the outperforming and underperforming enhanced images compared to the original images. To accomplish this, we first produce enhanced image sets of the original images using recent image enhancement models. Each enhanced set is then divided into two groups: (1) images that outperform or match the performance of the original images and (2) images that underperform. Subsequently, we create mixed original-enhanced sets by replacing underperforming enhanced images with their corresponding original images. Next, we conduct a detailed analysis by evaluating all generated groups for quality and detection performance attributes. Finally, we perform an overlap analysis between the generated enhanced sets to identify cases where the enhanced images of different enhancement algorithms unanimously outperform, equally perform, or underperform the original images. Our analysis reveals that, when evaluated individually, most enhanced images achieve equal or superior performance compared to their original counterparts. The proposed method uncovers variations in detection performance that are not apparent in a whole-set evaluation: the per-image evaluation reveals that only a small percentage of enhanced images cause an overall negative impact on detection. We also find that over-enhancement may lead to deteriorated object detection performance. Lastly, we note that enhanced images reveal hidden objects that were not annotated due to the low visibility of the original images.
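
The grouping and mixing steps described above can be sketched as follows. The per-image scores are hypothetical placeholders for whatever detection metric is computed per image; the abstract does not name the metric, and the function and variable names are illustrative.

```python
def build_mixed_set(original_scores, enhanced_scores):
    """Split enhanced images by per-image performance and build a mixed set.

    Both arguments map image id -> per-image detection score (higher is better).
    """
    outperform_or_match, underperform, mixed = [], [], {}
    for image_id, original in original_scores.items():
        enhanced = enhanced_scores[image_id]
        if enhanced >= original:
            outperform_or_match.append(image_id)
            mixed[image_id] = "enhanced"   # keep the enhanced version
        else:
            underperform.append(image_id)
            mixed[image_id] = "original"   # fall back to the original image
    return outperform_or_match, underperform, mixed


# Hypothetical scores for three images.
original_scores = {"img1": 0.70, "img2": 0.55, "img3": 0.80}
enhanced_scores = {"img1": 0.75, "img2": 0.55, "img3": 0.60}
better, worse, mixed = build_mixed_set(original_scores, enhanced_scores)
print(better, worse, mixed)
```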

Journal Article

BG-YOLO: A Bidirectional-Guided Method for Underwater Object Detection

by Yan, Xinyue , Zhang, Ruiteng , Cao, Ruicheng in Accuracy , Algorithms , Deep learning

2024

Degraded underwater images decrease the accuracy of underwater object detection. Existing research uses image enhancement methods to improve the visual quality of images, which may not benefit underwater object detection and can lead to serious degradation in detector performance. To alleviate this problem, we propose a bidirectional guided method for underwater object detection, referred to as BG-YOLO. In the proposed method, a network is organized by constructing an image enhancement branch and an object detection branch in parallel. The image enhancement branch consists of a cascade of an image enhancement subnet and an object detection subnet. The object detection branch consists only of a detection subnet. A feature-guided module connects the shallow convolution layers of the two branches. When training the image enhancement branch, the object detection subnet in the enhancement branch guides the image enhancement subnet to be optimized in the direction most conducive to the detection task. The shallow feature map of the trained image enhancement branch is output to the feature-guided module, constraining the optimization of the object detection branch through a consistency loss and prompting the object detection branch to learn more detailed information about the objects. This enhances the detection performance. During detection, only the object detection branch is retained, so no additional computational cost is introduced. Extensive experiments demonstrate that the proposed method significantly improves the detection performance of the YOLOv5s object detection network (the mAP is increased by up to 2.9%) and maintains the same inference speed as YOLOv5s (132 fps).

Journal Article

Unsupervised clustering optimization-based efficient attention in YOLO for underwater object detection

by Yuan, Guoliang , Shen, Xin , Wang, Huibing in Algorithms , Artificial Intelligence , Attention

2025

Underwater object detection is a prerequisite for underwater robots to realize ocean exploration and autonomous grasping. However, underwater detection tasks face some inevitable interference factors, such as poor imaging quality, strong environment randomness, and high organism concealment. These phenomena will lead to strong underwater background interference and weak underwater object perception, which greatly aggravates the difficulty of underwater object detection. In order to deal with the above problems, we propose an unsupervised clustering optimization-based efficient attention (UCOEA). Different from the channel-wise strategy, cross-channel strategy and channel grouping strategy, we design a channel clustering strategy, which achieves autonomous dynamic screening of channel information by using the K-Means algorithm. Same types of channel information with high redundancy are learned uniformly to share the same operation. Different types of channel information with high specificity are learned independently to avoid channel noise information interference. Different from the single spatial strategy and multiple spatial strategy, we design a spatial clustering strategy, which achieves autonomous dynamic stripping of spatial information by using the EM algorithm. This strategy can extract multiple required spatial information at one time from different spatial locations. We further assign learnable weight parameters to distinguish dominant information and auxiliary information, which can alleviate spatial noise information interference. Our strategies can better balance additional cost overhead and information processing quality, which is crucial for the proposed attention to achieve fast and accurate underwater information calibration. 
In order to achieve high-precision and real-time underwater object detection, we propose a combined system of UCOEA underwater adapter and one-stage YOLO detector, which can efficiently detect small, medium and large targets at the same time. Extensive experiments demonstrate the effectiveness of our work. More importantly, we publish an underwater detection dataset DLMU2024 with low image continuity and high data diversity, which provides reliable support for the rapid development of underwater detection research. Our dataset is available at https://github.com/shenxin-dlmu/DLMU2024 .
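
To illustrate what clustering channels with K-Means could look like, the sketch below groups channels by a single scalar descriptor each. The choice of descriptor (for example, a globally averaged activation), the number of clusters, and the deterministic initialization are all assumptions made for this example; the abstract does not specify them, and this is not the UCOEA implementation.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain K-Means on scalars; returns (labels, centroids)."""
    ordered = sorted(values)
    # Deterministic init: spread the starting centroids across the sorted values.
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids


# Hypothetical per-channel descriptors for seven channels.
channel_descriptors = [0.10, 0.12, 0.11, 0.90, 0.88, 0.50, 0.52]
labels, centroids = kmeans_1d(channel_descriptors, k=3)
print(labels)  # channels sharing a label would share the same operation
```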

Journal Article

AB-YOLOv8: Attention-based Feature Extraction model for Underwater Object Detection

by De, Sourav , Gurung, Sandeep , Misra, Bitan

2025

Accurate and timely underwater object detection is crucial in the field of marine environmental engineering. The detection of such targets has been improved recently using techniques based on Convolutional Neural Networks (CNN). However, the processing performance of deep neural networks is typically inadequate due to their high parameter requirements. Accurate detection is difficult with current techniques when dealing with small, close-packed underwater targets. To overcome these problems, the proposed work combines YOLOv8 with different attention modules and proposes a novel neural network model to enhance underwater object detection capabilities. In this research, AB-YOLOv8 is proposed, which adds the attention mechanism to the original YOLOv8 design. To be more precise, the proposed work introduces four attention modules, Efficient Channel Attention (ECA), Shuffle Attention (SA), Global Attention Mechanism (GAM), and Convolutional Block Attention Module (CBAM), to create the enhanced models and train them on the aquarium dataset. Each of the attention blocks is combined with YOLOv8 to improve overall object detection performance. A residual block is introduced into the CBAM to optimize its performance. Detailed experiments are conducted on the aquarium dataset, and various performance assessment parameters are used, such as mAP, FLOPS, Params, and inference time. The experiments found that ECA gives the best result of all the attention blocks, improving the mAP value by 8% and also reducing the number of parameters. To validate the work, we also performed the experiment on the Brackish dataset and found that ECA outperforms the other attention mechanisms with YOLOv8.

Journal Article
