328 result(s) for "2D detection"
C2L3-Fusion: An Integrated 3D Object Detection Method for Autonomous Vehicles
Accurate 3D object detection is crucial for autonomous vehicles (AVs) to navigate safely in complex environments. This paper introduces a novel fusion framework that integrates Camera image-based 2D object detection using YOLOv8 and LiDAR data-based 3D object detection using PointPillars, hence named C2L3-Fusion. Unlike conventional fusion approaches, which often struggle with feature misalignment, C2L3-Fusion enhances spatial consistency and multi-level feature aggregation, significantly improving detection accuracy. Our method outperforms state-of-the-art approaches such as YoPi-CLOCs Fusion Network, standalone YOLOv8, and standalone PointPillars, achieving mean Average Precision (mAP) scores of 89.91% (easy), 79.26% (moderate), and 78.01% (hard) on the KITTI dataset. Successfully implemented on the Nvidia Jetson AGX Xavier embedded platform, C2L3-Fusion maintains real-time performance while enhancing robustness, making it highly suitable for self-driving vehicles. This paper details the methodology, mathematical formulations, algorithmic advancements, and real-world testing of C2L3-Fusion, offering a comprehensive solution for 3D object detection in autonomous navigation.
A feature cascade and recursive fusion architecture for traffic sign detection in vehicle perception
In driving scenarios, traffic sign detection technology frequently suffers from reduced detection accuracy due to practical challenges such as tiny objects, scale variation, and fluctuating lighting conditions. Specifically, the representational information of tiny traffic signs becomes progressively blurred or even entirely lost as the receptive field expands. Furthermore, existing feature fusion networks place greater emphasis on information integration for conventionally sized objects, thereby exacerbating the insufficiency of fused feature information for small objects. To address these issues, we propose an end-to-end traffic sign detection algorithm, termed EMB-RFNet, designed to improve the detection accuracy of small traffic signs. It aggregates multi-scale fine-grained object information by designing a recursive path fusion network. Concurrently, it employs an efficient multi-branch cross-stage partial network to enhance multi-scale feature capture, while utilizing a shallow feature supplementation mechanism to compensate for semantic information loss in small objects. We validate the effectiveness of EMB-RFNet through experiments on the TT100K, GTSDB, and CCTSDB public datasets. Its mAP@0.5 metrics achieve 84.9%, 91.5%, and 84.1% respectively across the three datasets, significantly improving traffic sign detection performance for small objects. This demonstrates the superior capability of EMB-RFNet in traffic sign detection, particularly for detecting small objects.
Monocular 3D Object Detection Based on Pseudo Multimodal Information Extraction and Keypoint Estimation
Three-dimensional object detection is an essential and fundamental task in the field of computer vision which can be widely used in various scenarios such as autonomous driving and visual navigation. In view of the insufficient utilization of image information in current monocular camera-based 3D object detection algorithms, we propose a monocular 3D object detection algorithm based on pseudo-multimodal information extraction and keypoint estimation. We utilize the original image to generate pseudo-lidar and a bird’s-eye view, then feed the fused data of the original image and pseudo-lidar to the keypoint-based network for an initial 3D box estimation, and finally use the bird’s-eye view to refine the initial 3D box. The experimental performance of our method exceeds state-of-the-art algorithms under the evaluation criteria of 3D object detection and localization on the KITTI dataset, achieving the best experimental performance so far.
Simplification of strip-wise algorithms applied for two-dimensional intersymbol interference detection
With the development of high-speed communication channels and high-capacity storage devices, intersymbol interference (ISI) occurs more frequently in the condensed data. The traditional Bahl–Cocke–Jelinek–Raviv (BCJR) method is capable of solving the two-dimensional (2D) symbol influence problem, but at extremely high cost. By analysing the complexity in terms of branch, state and path in the 2D trellis-based ISI detection, the authors verify that the system complexity is mainly dominated by the window size. To reduce the detection complexity, the conventional simplifications are applied in the 2D strip-wise ISI detection. In addition, a two-phase strip-wise detection is proposed to achieve a lower complexity than that of the conventional simplifications. In this system, the first detector can be realised by a low-complexity method (like the hard detection or Viterbi algorithm) to provide the mask shrinking with predicted data. After reducing the mask size, the second detector can detect the main portion of ISI by the trellis-based detection (like the BCJR algorithm) with a small mask. Under similar detection performance, the proposed scheme can achieve more than 91.76% saving in the metric computation compared with the 5 × 3-window IRCSDFA (iterative row–column soft-decision feedback algorithm) simplified by the M-algorithm with M = 8.
A Survey of Computer Vision Methods for 2D Object Detection from Unmanned Aerial Vehicles
The spread of Unmanned Aerial Vehicles (UAVs) in the last decade revolutionized many application fields. Most investigated research topics focus on increasing autonomy during operational campaigns, environmental monitoring, surveillance, maps, and labeling. To achieve such complex goals, a high-level module is exploited to build semantic knowledge leveraging the outputs of the low-level module that takes data acquired from multiple sensors and extracts information concerning what is sensed. All in all, the detection of the objects is undoubtedly the most important low-level task, and the most employed sensors to accomplish it are by far RGB cameras due to costs, dimensions, and the wide literature on RGB-based object detection. This survey presents recent advancements in 2D object detection for the case of UAVs, focusing on the differences, strategies, and trade-offs between the generic problem of object detection and the adaptation of such solutions for operations of the UAV. Moreover, a new taxonomy is proposed that considers different height intervals and is driven by the methodological approaches introduced by the works in the state of the art instead of hardware, physical and/or technological constraints.
Multiscale Object Detection from Drone Imagery Using Ensemble Transfer Learning
Object detection in uncrewed aerial vehicle (UAV) images has been a longstanding challenge in the field of computer vision. Specifically, object detection in drone images is a complex task due to objects of various scales such as humans, buildings, water bodies, and hills. In this paper, we present an implementation of ensemble transfer learning to enhance the performance of the base models for multiscale object detection in drone imagery. Combined with a test-time augmentation pipeline, the algorithm combines different models and applies voting strategies to detect objects of various scales in UAV images. The data augmentation also presents a solution to the deficiency of drone image datasets. We experimented with two specific datasets in the open domain: the VisDrone dataset and the AU-AIR Dataset. Our approach is more practical and efficient due to the use of transfer learning and two-level voting strategy ensemble instead of training custom models on entire datasets. The experimentation shows significant improvement in the mAP for both VisDrone and AU-AIR datasets by employing the ensemble transfer learning method. Furthermore, the utilization of voting strategies further increases the reliability of the ensemble as the end-user can select and trace the effects of the mechanism for bounding box predictions.
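The abstract above mentions voting strategies over the boxes predicted by several models but does not spell them out. The following is a minimal sketch of one way such voting could work, not the authors' implementation. The strategy names (`affirmative`, `consensus`, `unanimous`), the 0.5 IoU grouping threshold, the coordinate averaging, and the function names are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def vote(per_model: List[List[Box]], strategy: str = "consensus",
         iou_thr: float = 0.5) -> List[Box]:
    """Group overlapping boxes across models and keep groups that win the vote.

    Hypothetical strategies (assumed, not taken from the abstract):
      affirmative: at least one model found the object
      consensus:   more than half of the models found it
      unanimous:   every model found it
    """
    groups: List[List[Tuple[int, Box]]] = []  # each entry: (model index, box)
    for m, boxes in enumerate(per_model):
        for box in boxes:
            for g in groups:
                # Join the first group whose seed box overlaps enough,
                # provided this model has not already voted in that group.
                if iou(g[0][1], box) >= iou_thr and all(i != m for i, _ in g):
                    g.append((m, box))
                    break
            else:
                groups.append([(m, box)])

    n = len(per_model)
    needed = {"affirmative": 1, "consensus": n // 2 + 1, "unanimous": n}[strategy]

    kept: List[Box] = []
    for g in groups:
        if len(g) >= needed:
            # Average the coordinates of the agreeing boxes.
            kept.append(tuple(sum(b[k] for _, b in g) / len(g) for k in range(4)))
    return kept
```

With three models, `vote(dets, "affirmative")` keeps every grouped detection, while `"unanimous"` keeps only objects that all three models found, which is one way an end user could trade recall against precision.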
3DCD: A New Dataset for 2D and 3D Change Detection Using Deep Learning Techniques
Change detection is one of the main topics in Earth Observation, due to its wide range of applications, varying from urban development monitoring to natural disaster management. Most of the recently developed change detection methodologies rely on the use of deep learning algorithms. These kinds of algorithms are generally focused on generating two-dimensional (2D) change maps, thus they are only able to detect horizontal changes in land use/land cover, neither considering nor returning any information on the corresponding elevation changes. Our work proposes a step forward, creating and sharing a dataset where two optical images acquired in different epochs are provided together with both the related 2D change maps containing land use/land cover variations and the three-dimensional (3D) maps containing elevation changes. In particular, our aim is to provide a dataset useful to address and possibly solve the change detection task in 3D. Indeed, the proposed dataset, on the one hand, can empower further development of 2D change detection algorithms and, on the other hand, can enable the development of algorithms able to provide 3D change detection maps from two optical images captured in different epochs, without the need to rely directly on elevation data as input. The proposed dataset is publicly available at the following link: https://bit.ly/3wDdo41.
Image-Based Pothole Detection Using Multi-Scale Feature Network and Risk Assessment
Potholes on road surfaces pose a serious hazard to vehicles and passengers due to the difficulty of detecting them and the short response time. Therefore, many government agencies are applying various pothole-detection algorithms for road maintenance. However, current methods based on object detection are unclear in terms of real-time detection when using low-spec hardware systems. In this study, the SPFPN-YOLOv4 tiny was developed by combining spatial pyramid pooling and feature pyramid network with CSPDarknet53-tiny. A total of 2665 datasets were obtained via data augmentation, such as gamma regulation, horizontal flip, and scaling to compensate for the lack of data, and were divided into training, validation, and test of 70%, 20%, and 10% ratios, respectively. As a result of the comparison of YOLOv2, YOLOv3, YOLOv4 tiny, and SPFPN-YOLOv4 tiny, the SPFPN-YOLOv4 tiny showed approximately 2–5% performance improvement in the mean average precision (intersection over union = 0.5). In addition, the risk assessment based on the proposed SPFPN-YOLOv4 tiny was calculated by comparing the tire contact patch size with pothole size by applying the pinhole camera and distance estimation equation. In conclusion, we developed an end-to-end algorithm that can detect potholes and classify the risks in real-time using 2D pothole images.
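The risk assessment described above compares the pothole size with the tire contact patch size using the pinhole camera model. A minimal sketch of that idea follows. It assumes the pothole is roughly parallel to the image plane, which is a simplification for a road viewed from a vehicle. The function names, the default contact patch dimensions, and the three risk levels are hypothetical and are not taken from the paper.

```python
def real_size_m(pixel_extent: float, distance_m: float, focal_px: float) -> float:
    """Pinhole model: real size = pixel extent * distance / focal length (pixels)."""
    return pixel_extent * distance_m / focal_px


def pothole_risk(width_px: float, length_px: float,
                 distance_m: float, focal_px: float,
                 patch_width_m: float = 0.20,
                 patch_length_m: float = 0.15) -> str:
    """Classify risk by comparing the estimated pothole size with the contact patch.

    The default patch dimensions and the three-level rule are illustrative
    assumptions, not values reported in the abstract.
    """
    width_m = real_size_m(width_px, distance_m, focal_px)
    length_m = real_size_m(length_px, distance_m, focal_px)

    exceeds_width = width_m >= patch_width_m
    exceeds_length = length_m >= patch_length_m
    if exceeds_width and exceeds_length:
        return "high"    # the tire can drop fully into the pothole
    if exceeds_width or exceeds_length:
        return "medium"  # the pothole exceeds the patch in one direction
    return "low"         # the patch bridges the pothole
```

For instance, a detection box 100 pixels wide at an estimated distance of 5 m with a focal length of 1000 pixels corresponds to about 0.5 m under this model.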
Research on a Fusion Technique of YOLOv8-URE-Based 2D Vision and Point Cloud for Robotic Grasping in Stacked Scenarios
In industrial robotic grasping tasks, traditional 3D point cloud registration and pose estimation methods often struggle with low efficiency and limited accuracy in stacked and cluttered environments. To address these challenges, this paper proposes a grasp pose estimation algorithm that integrates 2D object detection based on YOLOv8-URE with 3D point cloud registration. In the detection stage, the method enhances object feature perception and localization by optimizing the receptive field structure and introducing attention mechanisms. It also employs an efficient multi-scale feature fusion strategy to improve bounding box regression accuracy. During point cloud processing, target centers predicted by the detector guide rapid segmentation, followed by robust registration techniques to estimate precise object poses. Experimental results demonstrate that YOLOv8-URE improves detection accuracy by 9.21% compared to YOLOv8n, reduces registration time by 60.5%, and significantly increases grasp success rates, proving its reliability and effectiveness in industrial scenarios.
2D vs. 3D Change Detection Using Aerial Imagery to Support Crisis Management of Large-Scale Events
Large-scale events represent a special challenge for crisis management. To ensure that participants can enjoy an event safely and carefree, it must be comprehensively prepared and attentively monitored. Remote sensing can provide valuable information to identify potential risks and take appropriate measures in order to prevent a disaster, or initiate emergency aid measures as quickly as possible in the event of an emergency. Especially, three-dimensional (3D) information that is derived using photogrammetry can be used to analyze the terrain and map existing structures that are set up at short notice. Using aerial imagery acquired during a German music festival in 2016 and the celebration of the German Protestant Church Assembly of 2017, the authors compare two-dimensional (2D) and novel fusion-based 3D change detection methods, and discuss their suitability for supporting large-scale events during the relevant phases of crisis management. This study serves to find out what added value the use of 3D change information can provide for on-site crisis management. Based on the results, an operational, fully automatic processor for crisis management operations and corresponding products for end users can be developed.