Catalogue Search | MBRL

SST-YOLO: An Improved Autonomous Driving Object Detection Algorithm Based on YOLOv8

by Zhang, Shiyan; Du, Qinsheng; Zhao, Jian in Accuracy, Algorithms, Autonomous driving

2026

As autonomous driving technology progresses, efficient and accurate object detectors that identify pedestrians, vehicles, road signs, and obstacles in real time have become a core component of autonomous driving systems and a means of enhancing driving safety. However, the performance of existing object detectors remains limited and does not yet satisfy the requirements of modern autonomous driving systems. To address this issue, we develop SST-YOLO, an object detection network for autonomous driving scenarios that is based on YOLOv8. First, we propose a Sobel Convolution & Convolution (SCC) module to enhance the backbone, which incorporates a SobelConv branch to explicitly model gradient-based edge information and improve structural feature representation. In addition, we replace the original path aggregation feature pyramid network (PAFPN) with a Small Object Augmentation Pyramid Network (SOAPN), which integrates SPDConv and CSP-OmniKernel modules to strengthen multi-scale feature fusion and enhance small object representation. Finally, a Task-Adaptive Decomposition & Alignment Head (TADAHead) is designed, which employs task decomposition, dynamic deformable convolution, and classification-aware modulation to decouple tasks and achieve adaptive spatial alignment, thereby improving detection accuracy and robustness in complex scenarios. Experiments on the public autonomous driving dataset KITTI show that the proposed method outperforms the baseline YOLOv8 model: mAP@0.5:0.95 rises from 65.1% to 69.2%, which indicates that the proposed SST-YOLO network is suitable for object detection in autonomous vehicles.
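
The abstract above refers to a SobelConv branch that models gradient-based edge information. The following is a minimal sketch of the classical fixed 3 × 3 Sobel operator only, in plain Python; it is not the authors' SCC module, and the function names `sobel_response` and `edge_magnitude` are hypothetical names chosen for illustration.

```python
# Illustrative sketch: classical Sobel kernels, not the SCC module from the paper.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def sobel_response(image, row, col):
    """Return (gx, gy) at an interior pixel via 3x3 cross-correlation."""
    gx = gy = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            pixel = image[row + dr][col + dc]
            gx += SOBEL_X[dr + 1][dc + 1] * pixel
            gy += SOBEL_Y[dr + 1][dc + 1] * pixel
    return gx, gy


def edge_magnitude(image):
    """Gradient magnitude at interior pixels; border pixels stay at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx, gy = sobel_response(image, r, c)
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


# A vertical step edge: dark left half, bright right half.
image = [[0, 0, 10, 10] for _ in range(4)]
magnitude = edge_magnitude(image)
```

On this step-edge image the horizontal response is non-zero at the interior pixels next to the edge while the vertical response is zero, which is the kind of structural cue an edge-aware branch is meant to expose.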

Journal Article

NAF-MEEF: A Nonlinear Activation-Free Network Based on Multi-Scale Edge Enhancement and Fusion for Railway Freight Car Image Denoising

by Yue, Jianhai; Chen, Jiawei; Hu, Zhunqing in Accuracy, Algorithms, Attention mechanism

2025

Railway freight cars operating in heavy-load and complex outdoor environments are frequently subject to adverse conditions such as haze, temperature fluctuations, and transmission interference, which significantly degrade the quality of the acquired images and introduce substantial noise. Furthermore, the structural complexity of freight cars, coupled with the small size, diversity, and complex structure of defect areas, poses serious challenges for image denoising. Specifically, it becomes extremely difficult to remove noise while simultaneously preserving fine-grained textures and edge details. These challenges distinguish railway freight car image denoising from conventional image restoration tasks, necessitating the design of specialized algorithms that can achieve both effective noise suppression and precise structural detail preservation. To address the challenges of incomplete denoising and poor preservation of details and edge information in railway freight car images, this paper proposes a novel image denoising algorithm named the Nonlinear Activation-Free Network based on Multi-Scale Edge Enhancement and Fusion (NAF-MEEF). The algorithm constructs a Multi-scale Edge Enhancement Initialization Layer to strengthen edge information at multiple scales. Additionally, it employs a Nonlinear Activation-Free feature extractor that effectively captures local and global image information. Leveraging the network’s multi-branch parallelism, a Multi-scale Rotation Fusion Attention Mechanism is developed to perform weight analysis on information across various scales and dimensions. To ensure consistency in image details and structure, this paper introduces a fusion loss function. The experimental results show that, compared with recent advanced methods, the proposed algorithm has better noise suppression and edge preservation performance.
The proposed method achieves significant denoising performance on railway freight car images affected by Gaussian, composite, and simulated real-world noise, with PSNR gains of 1.20 dB, 1.45 dB, and 0.69 dB, and SSIM improvements of 2.23%, 2.72%, and 1.08%, respectively. On public benchmarks, it attains average PSNRs of 30.34 dB (Set12) and 28.94 dB (BSD68), outperforming several state-of-the-art methods. In addition, this method also performs well in railway image dehazing tasks and demonstrates good generalization ability in denoising tests of remote sensing ship images, further proving its robustness and practical application value in diverse image restoration tasks.
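
The PSNR figures quoted above are in decibels. As a reminder of how such a value is obtained, here is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE), in plain Python. The 8-bit peak value of 255 and the function name `psnr` are assumptions made for illustration; the paper's own evaluation code is not shown in this record.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    max_value is the assumed peak intensity (255 for 8-bit images).
    """
    if len(reference) != len(restored) or any(
        len(a) != len(b) for a, b in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")
    count = sum(len(row) for row in reference)
    squared_error = sum(
        (a - b) ** 2
        for ref_row, res_row in zip(reference, restored)
        for a, b in zip(ref_row, res_row)
    )
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


clean = [[100, 100], [100, 100]]
noisy = [[101, 99], [101, 99]]  # every pixel is off by exactly 1
value = psnr(clean, noisy)
```

Because PSNR is logarithmic, a gain of about 1 dB corresponds to a reduction of roughly 20% in mean squared error.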

Journal Article

Robust deep learning-based fault detection of planetary gearbox using enhanced health data map under domain shift problem

by Youn, Byeng D; Hwang, Taewan; Ha, Jong Moon in Deep learning, Digital imaging, Fault detection

2023

The conventional deep learning-based fault diagnosis approach faces challenges under the domain shift problem, where the model encounters different working conditions from the ones it was trained on. This challenge is particularly pronounced in the diagnosis of planetary gearboxes due to the complicated vibrations they generate, which can vary significantly based on the system characteristics of the gearbox. To solve this challenge, this paper proposes a robust deep learning-based fault-detection approach for planetary gearboxes by utilizing an enhanced health data map (HDMap). Although there is an HDMap method that visually expresses the vibration signal of the planetary gearbox according to the gear meshing position, it is greatly influenced by machine operating conditions. In this study, domain-specific features from the HDMap are further removed, while the fault-related features are enhanced. Autoencoder-based residual analysis and digital image-processing techniques are employed to address the domain-shift problem. The performance of the proposed method was validated under significant domain-shift problem conditions, as demonstrated by studying two gearbox test rigs with different configurations operated under stationary and non-stationary operating conditions. Validation accuracy was measured in all 12 possible domain-shift scenarios. The proposed method achieved robust fault detection accuracy, outperforming prior methods in most cases.

Journal Article

A modified Canny edge detector based on weighted least squares

by Xu, Qin in Computer vision, Convolution, Edge detection

2021

Edge detection is the front-end processing stage in most computer vision and image understanding systems. Among the various edge detection techniques, the Canny edge detector is one of the most commonly used. This paper proposes a modified Canny edge detection technique that focuses on replacing the Sobel operator. Instead of convolution kernels, the weighted least squares method is used to calculate the horizontal and vertical gradients. Experimental results show that the new detector can detect some edges that are not found by the standard Canny edge detector.
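
To make the idea of a least-squares gradient concrete, the sketch below fits a plane I(x, y) ≈ a·x + b·y + c to a 3 × 3 window by weighted least squares and reads the gradient off as (a, b). This is one plausible reading of the abstract, not the author's published formulation: the window size, the weight choices, and the names `det3` and `wls_gradient` are all assumptions made for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def wls_gradient(window, weights):
    """Weighted least-squares plane fit over a 3x3 window; returns (a, b).

    Coordinates are x = column - 1 and y = row - 1, so the centre pixel is (0, 0).
    """
    sxx = sxy = syy = sx = sy = s = 0.0
    sxi = syi = si = 0.0
    for r in range(3):
        for c in range(3):
            x, y = c - 1, r - 1
            w, v = weights[r][c], window[r][c]
            sxx += w * x * x
            sxy += w * x * y
            syy += w * y * y
            sx += w * x
            sy += w * y
            s += w
            sxi += w * x * v
            syi += w * y * v
            si += w * v
    # Normal equations N [a, b, c]^T = rhs, solved with Cramer's rule.
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, s]]
    rhs = [sxi, syi, si]
    d = det3(normal)
    if d == 0:
        raise ValueError("weights give a singular system")

    def with_column(col):
        return [[rhs[i] if j == col else normal[i][j] for j in range(3)]
                for i in range(3)]

    return det3(with_column(0)) / d, det3(with_column(1)) / d


uniform = [[1, 1, 1]] * 3
plane = [[2 * (c - 1) + 3 * (r - 1) + 5 for c in range(3)] for r in range(3)]
a, b = wls_gradient(plane, uniform)
```

With the symmetric weights `[[1, 2, 1], [2, 4, 2], [1, 2, 1]]`, the fitted slope `a` equals the horizontal Sobel response divided by 8, so in this formulation the Sobel operator appears as one special case of the weighted fit.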

Journal Article

Reliable IoT-based Health-care System for Diabetic Retinopathy Diagnosis to defend the Vision of Patients

by Christy Jeba Malar A; Karthick, S; Deva, Priya M in Accuracy, Algorithms, Blood vessels

2021

Purpose: The purpose of this paper is to design an Internet-of-Things (IoT) architecture-based Diabetic Retinopathy Detection Scheme (DRDS) for identifying Type-I or Type-II diabetes and for specifically advising Type-II diabetic patients about the possibility of vision loss.
Design/methodology/approach: The proposed DRDS automatically calculates the clip limit parameters and sub-window, making the detection process completely adaptive. It uses an extended 5 × 5 Sobel operator to estimate the maximum edges, determined through the convolution of 24 pixels with eight templates to achieve 24 outputs corresponding to individual pixels for finding the maximum magnitude. It enhances the probability of connecting pixels in the vascular map with their closely located neighbourhood points in the fundus images. Then, the spatial information and kernel of the neighbourhood pixels are integrated through the Robust Semi-supervised Kernelized Fuzzy Local information C-Means Clustering (RSKFL-CMC) method to attain a significant clustering process.
Findings: The results of the proposed DRDS architecture confirm its predominance in terms of accuracy, specificity and sensitivity. The proposed DRDS technique achieves, on average, 99.64% accuracy, 76.84% sensitivity and 99.93% specificity.
Research limitations/implications: DRDS is proposed as a comfortable, pain-free and harmless diagnosis system using the merits of Dexcom G4 Platinum sensors for estimating blood glucose level in diabetic patients. It uses the RSKFL-CMC method to estimate the spatial information and kernel of the neighbourhood pixels for attaining a significant clustering process.
Practical implications: The IoT architecture comprises an application layer that hosts the DR application-enabled Graphical User Interface (GUI), which is combined with MATLAB applications for processing fundus images. This layer aids patients in storing the captured fundus images in the database for future diagnosis.
Social implications: The proposed DRDS method plays a vital role in the detection of DR and in its categorization, based on the intensity of the disease, into severe, moderate and mild grades. The proposed DRDS is responsible for preventing vision loss in Type-II diabetic patients through accurate and potential detection achieved by utilizing the IoT architecture.
Originality/value: The performance of the proposed scheme and of the benchmarked approaches from the literature is evaluated using MATLAB R2010a. The complete evaluations of the proposed scheme are conducted using the HRF, REVIEW, STARE and DRIVE data sets, with subjective quantification provided by experts for the purpose of potential retinal blood vessel segmentation.

Journal Article

Optimizing image processing on multi-core CPUs with Intel parallel programming technologies

by Kim, Jeom Goo; Kim, Cheong Ghil; Lee, Do Hyeon in Algorithms, Analysis, Architecture

2014

The rapid advance of computer hardware and the popularity of multimedia applications have made multi-core processors with sub-word parallelism instructions a dominant market trend in desktop PCs as well as high-end mobile devices. This paper presents an efficient parallel implementation of the 2D convolution algorithm, which demands high-performance computing power, on multi-core desktop PCs. 2D convolution is a representative computation-intensive algorithm in image and signal processing applications and is accompanied by heavy memory access, although its computational complexity is relatively low. The purpose of this study is to explore the effectiveness of exploiting the Streaming SIMD (Single Instruction, Multiple Data) Extensions (SSE) technology and the TBB (Threading Building Blocks) run-time library on Intel multi-core processors. By doing so, we can take advantage of all the hardware features of a multi-core processor concurrently for data- and task-level parallelism. For the performance evaluation, we implemented a 3 × 3 kernel-based convolution algorithm using SSE2 and TBB in different combinations and compared their processing speeds. The experimental results show that both technologies have a significant effect on performance and that the processing speed can be greatly improved when the two technologies are used together; for example, speedups of 6.2, 6.1, and 1.4 times over implementations using either technology alone are reported for 256 × 256, 512 × 512, and 1024 × 1024 data sets, respectively.
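
The task-level decomposition described above can be sketched by splitting the output rows of a 3 × 3 convolution into bands and handing each band to a worker. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor` purely as a stand-in: it is not SSE2 or TBB, Python threads will not reproduce the reported speedups, and the names `convolve_rows` and `convolve_parallel` are hypothetical. It only shows how the work can be partitioned so that bands are independent.

```python
from concurrent.futures import ThreadPoolExecutor


def convolve_rows(image, kernel, row_start, row_stop):
    """3x3 sliding-window sum of products for output rows [row_start, row_stop).

    Only the 'valid' region is produced, and the kernel is applied without
    flipping, which matches true convolution for symmetric kernels.
    """
    out_width = len(image[0]) - 2
    band = []
    for r in range(row_start, row_stop):
        out_row = []
        for c in range(out_width):
            acc = 0
            for kr in range(3):
                for kc in range(3):
                    acc += kernel[kr][kc] * image[r + kr][c + kc]
            out_row.append(acc)
        band.append(out_row)
    return band


def convolve_parallel(image, kernel, workers=2):
    """Split output rows into contiguous bands and process them concurrently."""
    out_height = len(image) - 2
    step = max(1, -(-out_height // workers))  # ceiling division
    bounds = [(start, min(start + step, out_height))
              for start in range(0, out_height, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so bands come back top to bottom.
        bands = pool.map(lambda b: convolve_rows(image, kernel, *b), bounds)
        return [row for band in bands for row in band]


image = [[r * 4 + c for c in range(4)] for r in range(4)]
box = [[1, 1, 1]] * 3
result = convolve_parallel(image, box, workers=2)
```

Each band reads overlapping input rows but writes disjoint output rows, so no synchronization is needed between workers; the inner loops over the kernel are the part that SIMD instructions would vectorize in a native implementation.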

Journal Article
