4,075 result(s) for "object tracking"
The Unmanned Aerial Vehicle Benchmark: Object Detection, Tracking and Baseline
With the growing popularity of Unmanned Aerial Vehicles (UAVs) in computer vision applications, intelligent UAV video analysis has recently attracted the attention of an increasing number of researchers. To facilitate research in this field, this paper presents a UAV dataset of 100 videos featuring approximately 2,700 vehicles recorded under unconstrained conditions, with 840k manually annotated bounding boxes. These videos were recorded in complex real-world scenarios and pose significant new challenges to existing object detection and tracking methods, such as complex scenes, high density, small objects, and large camera motion. These challenges encouraged us to define a benchmark for three fundamental computer vision tasks on our UAV dataset: object detection, single object tracking (SOT), and multiple object tracking (MOT). Specifically, the benchmark enables evaluation and detailed analysis of state-of-the-art detection and tracking methods on the proposed dataset. Furthermore, we propose a novel approach based on a Context-aware Multi-task Siamese Network (CMSN) model, which explores new cues in UAV videos by judging the degree of consistency between objects and their contexts, and which can be used for both SOT and MOT. The experimental results demonstrate that our model makes tracking more robust in both SOT and MOT, showing that current tracking and detection methods have limitations in dealing with the proposed UAV benchmark and that further research is indeed needed.
Comparing State-of-the-Art Deep Learning Algorithms for the Automated Detection and Tracking of Black Cattle
Effective livestock management is critical for cattle farms in today's competitive era of smart modern farming. Because manual identification and detection of cattle are not feasible in today's farming systems, farm management solutions must be efficient, affordable, and scalable. Fortunately, automatic tracking and identification systems have greatly improved in recent years. Moreover, correctly identifying individual cows is an integral part of predicting behavior during estrus: by monitoring a cow's behavior, we can pinpoint the right time for artificial insemination. However, most previous techniques have relied on direct observation, increasing the human workload. To overcome this problem, this paper proposes the use of state-of-the-art deep learning-based Multi-Object Tracking (MOT) algorithms in a complete system that can automatically and continuously detect and track cattle using an RGB camera. This study compares state-of-the-art MOT methods, such as Deep-SORT, Strong-SORT, and customized lightweight tracking algorithms. To improve the tracking accuracy of these deep learning methods, the paper presents an enhanced re-identification approach for a black cattle dataset in Strong-SORT. For evaluating tracking-by-detection, the system used YOLOv5 and YOLOv7, compared against the instance segmentation model Detectron2, to detect and classify the cattle. The system achieved high cattle-tracking accuracy, with a Multi-Object Tracking Accuracy (MOTA) of 96.88%. The findings demonstrate a highly accurate and robust cattle tracking system that can be applied to innovative monitoring systems for agricultural applications; its effectiveness and efficiency were demonstrated by analyzing a sample of video footage. The proposed method was developed to balance the trade-off between cost and management, thereby improving the productivity and profitability of dairy farms; it can also be adapted to other domestic species.
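As an illustration of the tracking-by-detection loop that such pipelines share, the sketch below implements a minimal greedy IoU association between existing tracks and new detections. It is a generic stand-in under assumed box formats and thresholds, not the paper's Strong-SORT with re-identification.

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2) in pixels.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def associate(tracks, detections, thresh=0.3):
    """Greedily match track boxes to detection boxes by IoU.
    tracks: {track_id: box}; detections: [box]; returns (id, det_idx) pairs."""
    matches, used = [], set()
    for tid, tbox in tracks.items():
        best, best_iou = None, thresh
        for j, dbox in enumerate(detections):
            s = iou(tbox, dbox)
            if j not in used and s > best_iou:
                best, best_iou = j, s
        if best is not None:
            matches.append((tid, best))
            used.add(best)
    return matches
```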
Stabilization and Validation of 3D Object Position Using Multimodal Sensor Fusion and Semantic Segmentation
The stabilization and validation of the measured position of objects is an important step for high-level perception functions and for the correct processing of sensory data. The goal of this process is to detect and handle inconsistencies between different sensor measurements produced by the perception system. Aggregating the detections from different sensors consists of combining the sensor data in one common reference frame for each identified object, leading to the creation of a super-sensor. The data aggregation may still produce errors such as false detections, misplaced object cuboids, or an incorrect number of objects in the scene; the stabilization and validation process is focused on mitigating these problems. The current paper proposes four contributions for solving the stabilization and validation task for autonomous vehicles using the following sensors: a trifocal camera, a fisheye camera, long-range RADAR (Radio Detection and Ranging), and 4-layer and 16-layer LIDARs (Light Detection and Ranging). We propose two original data association methods used in the sensor fusion and tracking processes. The first data association algorithm is created for tracking LIDAR objects and combines multiple appearance and motion features to exploit the available information for road objects. The second novel data association algorithm is designed for trifocal camera objects and aims to find measurement correspondences to sensor-fused objects so that the super-sensor data are enriched with semantic class information. The implemented trifocal object association solution uses a novel polar association scheme combined with a decision tree to find the best hypothesis-measurement correlations. Another contribution, for stabilizing the position and unpredictable behavior of road objects provided by multiple types of complementary sensors, is a fusion approach based on the Unscented Kalman Filter and a single-layer perceptron. The last novel contribution concerns the validation of the 3D object position, which is solved using a fuzzy logic technique combined with a semantic segmentation image. The proposed algorithms achieve real-time performance, with a cumulative running time of 90 ms, and have been evaluated using ground truth data extracted from a high-precision GPS (Global Positioning System) with 2 cm accuracy, obtaining an average error of 0.8 m.
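To make the fusion step concrete, the sketch below runs a plain linear Kalman filter that sequentially fuses position fixes from two sensors with different noise levels. This is a deliberate simplification: the paper uses an Unscented Kalman Filter combined with a single-layer perceptron, and all matrices and noise values here are illustrative assumptions.

```python
import numpy as np

# Constant-velocity state [x, y, vx, vy]; a simplified *linear* Kalman
# filter stands in for the paper's Unscented variant.
dt = 0.05
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # we observe position only
Q = np.eye(4) * 0.01                        # process noise (assumed)

def kf_step(x, P, z, R):
    """One predict + update cycle for a single sensor measurement z."""
    x = F @ x
    P = F @ P @ F.T + Q
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = np.zeros(4), np.eye(4)
# Sequentially fuse a LIDAR fix and a noisier RADAR fix of the same object.
x, P = kf_step(x, P, np.array([12.3, 4.1]), R=np.eye(2) * 0.05)
x, P = kf_step(x, P, np.array([12.6, 4.0]), R=np.eye(2) * 0.50)
```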
Visual Object Tracking in First Person Vision
The understanding of human-object interactions is fundamental in First Person Vision (FPV). Visual tracking algorithms that follow the objects manipulated by the camera wearer can provide useful information for effectively modelling such interactions. In recent years, the computer vision community has significantly improved the performance of tracking algorithms for a large variety of target objects and scenarios. Despite a few previous attempts to exploit trackers in the FPV domain, a methodical analysis of the performance of state-of-the-art trackers is still missing. This research gap raises the question of whether current solutions can be used "off-the-shelf" or whether more domain-specific investigations should be carried out. This paper aims to answer these questions. We present the first systematic investigation of single object tracking in FPV. Our study extensively analyses the performance of 42 algorithms, including generic object trackers and baseline FPV-specific trackers. The analysis focuses on different aspects of the FPV setting, introduces new performance measures, and relates tracking to FPV-specific tasks. The study is made possible through the introduction of TREK-150, a novel benchmark dataset composed of 150 densely annotated video sequences. Our results show that object tracking in FPV poses new challenges to current visual trackers. We highlight the factors causing such behavior and point out possible research directions. Despite the difficulties, we show that trackers bring benefits to FPV downstream tasks requiring short-term object tracking. We expect that generic object tracking will gain popularity in FPV as new and FPV-specific methodologies are investigated.
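For context on how such evaluations are scored, the sketch below computes the standard success-rate AUC used in single-object tracking benchmarks. The paper also introduces new FPV-specific measures, which are not reproduced here; the box format is an assumption.

```python
import numpy as np

def iou(a, b):
    # Boxes as (x1, y1, x2, y2) in pixels.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def success_auc(pred_boxes, gt_boxes, n_thresh=21):
    """Fraction of frames whose IoU exceeds each threshold in [0, 1],
    averaged over thresholds (area under the success plot)."""
    ious = np.array([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)])
    thresholds = np.linspace(0.0, 1.0, n_thresh)
    return float(np.mean([(ious > t).mean() for t in thresholds]))
```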
Text-Guided Multi-Class Multi-Object Tracking for Fine-Grained Maritime Rescue
The rapid development of remote sensing technology has provided new sources of data for marine rescue and has made it possible to find and track survivors. Because multiple survivors must be tracked simultaneously, multi-object tracking (MOT) has become the key subtask of marine rescue. However, there is a significant gap between the fine-grained objects found in realistic marine rescue remote sensing data and the fine-grained tracking capability of existing MOT technologies, which mainly focus on coarse-grained object scenarios and fail to track fine-grained instances. This gap limits the practical application of MOT to realistic marine rescue remote sensing data, especially when rescue forces are limited. Given the promising fine-grained classification performance of recent text-guided methods, we leverage labels and attributes to narrow the gap between MOT and fine-grained maritime rescue. We propose a text-guided multi-class multi-object tracking (TG-MCMOT) method. To handle the problem posed by fine-grained classes, we design a multi-modal encoder that aligns external textual information with visual inputs. We decode information at different levels, simultaneously predicting the category, location, and identity embedding features of objects. Meanwhile, to improve small object detection, we also develop a data augmentation pipeline that generates pseudo-near-infrared images from RGB images. Extensive experiments demonstrate that our TG-MCMOT not only performs well on typical metrics for the maritime rescue task (SeaDronesSee dataset) but also effectively tracks open-set categories on the BURST dataset. Specifically, on the SeaDronesSee dataset, the Higher Order Tracking Accuracy (HOTA) reached a score of 58.8, and on the BURST test dataset, the HOTA score for the unknown class improved by 16.07 points.
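The paper does not spell out its pseudo-near-infrared recipe, so purely as an illustration of the idea, the sketch below synthesises a single pseudo-NIR channel from RGB with an assumed linear channel mix. The weights are hypothetical placeholders, not the authors' pipeline.

```python
import numpy as np

def pseudo_nir(rgb):
    """rgb: float array in [0, 1], shape (H, W, 3).
    Hypothetical channel mix; the paper does not publish its recipe."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    nir = 1.4 * r - 0.3 * g - 0.1 * b   # assumed weights, for illustration
    return np.clip(nir, 0.0, 1.0)
```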
Confidence-Guided Frame Skipping to Enhance Object Tracking Speed
Object tracking is a challenging task in computer vision. While simple tracking methods offer fast speeds, they often lose their targets; traditional remedies rely on complex algorithms. This study presents a novel approach that enhances object tracking speed via confidence-guided frame skipping, strategically designed to complement existing methods. Initially, a lightweight tracker follows the target; only when it fails does an existing, robust but complex algorithm take over. The contribution of this study lies in the proposed confidence assessment of the lightweight tracker's results: the method decides whether the robust algorithm needs to intervene based on the predicted confidence level. This two-tiered approach significantly enhances tracking speed by using the lightweight method in straightforward situations and the robust algorithm in challenging ones. Experimental results demonstrate the effectiveness of the proposed approach in enhancing tracking speed.
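The control flow the abstract describes reduces to a simple gate on the lightweight tracker's confidence. A minimal sketch with stand-in tracker callables follows; the (box, confidence) interface and the threshold value are assumptions, not the paper's API.

```python
def track_video(frames, init_box, light_tracker, heavy_tracker, conf_thresh=0.5):
    """Two-tier tracking: run the cheap tracker on every frame and fall
    back to the expensive one only when its confidence drops below the
    gate. Both trackers are stand-ins returning (box, confidence)."""
    box, boxes = init_box, []
    for frame in frames:
        box, conf = light_tracker(frame, box)
        if conf < conf_thresh:            # lightweight result is untrusted
            box, conf = heavy_tracker(frame, box)
        boxes.append(box)
    return boxes
```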
Deep Trajectory Post-Processing and Position Projection for Single & Multiple Camera Multiple Object Tracking
Multiple Object Tracking (MOT) has attracted increasing interest in recent years and plays a significant role in video analysis. MOT aims to track specific targets as whole trajectories and to locate the positions of each trajectory at different times. These trajectories are typically used in action recognition, anomaly detection, crowd analysis, multiple-camera tracking, and similar applications. However, complex scenes remain a challenge for existing methods, and generating false (impure or incomplete) tracklets directly degrades the performance of subsequent tasks. We therefore propose a novel architecture, a Siamese bi-directional GRU, to construct a Cleaving Network and a Re-connection Network for trajectory post-processing. The Cleaving Network splits impure tracklets into several pure sub-tracklets, and the Re-connection Network re-connects tracklets belonging to the same person into a whole trajectory. We also extend our method to multiple-camera tracking, where current methods rarely consider spatial-temporal constraints and thus incur redundant trajectory matching. To address this, we present a Position Projection Network (PPN) that converts trajectory positions from local camera coordinates to global world coordinates, providing adequate and accurate spatial-temporal information for trajectory association. The proposed technique is evaluated on two widely used datasets, MOT16 and Duke-MTMCT, and experiments demonstrate its superior effectiveness compared with the state of the art.
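The paper's PPN learns the camera-to-world mapping; as a classical point of reference, the sketch below projects an image point onto the ground plane with a fixed planar homography. The matrix values are dummies, not a real calibration.

```python
import numpy as np

# Planar homography mapping image pixels to ground-plane world metres.
# H_mat would normally come from per-camera calibration; these are dummies.
H_mat = np.array([[0.02, 0.0,    -5.0],
                  [0.0,  0.03,   -8.0],
                  [0.0,  0.0001,  1.0]])

def image_to_world(u, v):
    """Apply the homography to pixel (u, v) and dehomogenise."""
    p = H_mat @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Project the foot point of a tracked bounding box into world coordinates.
x_w, y_w = image_to_world(640, 360)
```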
A Review of Multi‐Object Tracking in Recent Times
Multi-object tracking (MOT) is a fundamental problem in computer vision that involves tracing the trajectories of foreground targets throughout a video sequence while establishing correspondences for identical objects across frames. With the advancement of deep learning techniques, deep learning-based methods have significantly improved accuracy and efficiency in MOT. This paper reviews several recent deep learning-based MOT methods and, according to their core technologies, categorises them into three main groups: detection-based, single-object tracking (SOT)-based, and segmentation-based methods. Additionally, this paper discusses the metrics and datasets used for evaluating MOT performance, the challenges faced in the field, and future directions for research.
Towards Reliable Identification and Tracking of Drones Within a Swarm
Drone swarms consist of multiple drones that together can achieve tasks individual drones cannot, such as search and recovery or surveillance over a large area. A swarm's internal structure typically consists of multiple drones operating autonomously. Reliable detection and tracking of swarms and of individual drones allow a greater understanding of a swarm's behaviour and movement, which in turn enables better coordination, collision avoidance, and performance monitoring of individual drones in the swarm. The research presented in this paper proposes a deep learning-based approach for reliably detecting and tracking individual drones within a swarm in real time using stereo-vision cameras. The proposed solution provides a precise tracking system and accounts for the highly dense and dynamic behaviour of drones. The approach is evaluated in both sparse and dense networks in a variety of configurations. The accuracy and efficiency of the proposed solution were analysed through a series of comparative experiments that demonstrate reasonable accuracy in detecting and tracking drones within a swarm.
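The stereo-vision setup implies recovering per-drone depth from disparity via the standard relation Z = f * B / d. A minimal worked example under assumed calibration values (the focal length and baseline below are hypothetical):

```python
# Depth from stereo disparity: Z = f * B / d, where f is the focal length
# in pixels, B the baseline in metres, and d the disparity in pixels.
f_px = 1200.0        # assumed focal length (pixels)
baseline_m = 0.12    # assumed stereo baseline (metres)

def depth_from_disparity(d_px):
    return f_px * baseline_m / d_px if d_px > 0 else float("inf")

z = depth_from_disparity(24.0)   # ~6.0 m to the detected drone
```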
Object tracking on event cameras with offline–online learning
Compared with conventional image sensors, event cameras have been attracting attention thanks to their potential in environments with fast motion and high dynamic range (HDR). To tackle lost tracks caused by fast illumination changes in HDR scenes such as tunnels, an object tracking framework is presented based on event count images from an event camera. The framework contains an offline-trained detector and an online-trained tracker that complement each other: the detector benefits from pre-labelled training data but may produce false or missing detections; the tracker provides persistent results for each initialised object but may drift or even fail. In addition, process and measurement equations are modelled, and a Kalman fusion scheme is proposed to incorporate measurements from both the detector and the tracker. Self-initialisation and track maintenance in the fusion scheme ensure autonomous real-time tracking without user intervention. Experiments on self-collected event data from urban driving scenarios demonstrate the performance of the proposed framework and fusion scheme.
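An event count image simply bins asynchronous events into a fixed-size frame over a time window, giving the detector and tracker a conventional input. A minimal sketch, assuming events arrive as (x, y, timestamp, polarity) tuples:

```python
import numpy as np

def event_count_image(events, height, width):
    """Accumulate per-pixel event counts over one time window."""
    img = np.zeros((height, width), dtype=np.int32)
    for x, y, _t, _polarity in events:
        img[y, x] += 1
    return img

# Two events at the same pixel produce a count of 2 there.
frame = event_count_image([(10, 5, 0.001, 1), (10, 5, 0.002, -1)], 480, 640)
```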