Catalogue Search | MBRL

Visual-SLAM Classical Framework and Key Techniques: A Review

by Xu, Weiqing , Li, Xiaoying , Jia, Guanwei in Accuracy , Automation , Cameras

2022

With the significant increase in demand for artificial intelligence, environmental map reconstruction has become a research hotspot for obstacle avoidance navigation, unmanned operations, and virtual reality. The quality of the map plays a vital role in positioning, path planning, and obstacle avoidance. This review starts with the development of SLAM (Simultaneous Localization and Mapping) and proceeds to a review of V-SLAM (Visual-SLAM) from its proposal to the present, with a summary of its historical milestones. In this context, the five parts of the classic V-SLAM framework—visual sensor, visual odometer, backend optimization, loop detection, and mapping—are explained separately. Meanwhile, the details of the latest methods are shown; VI-SLAM (Visual inertial SLAM) is reviewed and extended. The four critical techniques of V-SLAM and its technical difficulties are summarized as feature detection and matching, selection of keyframes, uncertainty technology, and expression of maps. Finally, the development direction and needs of the V-SLAM field are proposed.

Journal Article

A Comprehensive Survey of Visual SLAM Algorithms

by Moline, Yoann , Carrel, Frédérick , Corre, Gwenolé in 3D reconstruction , Algorithms , Cameras

2022

Simultaneous localization and mapping (SLAM) techniques are widely researched, since they allow the simultaneous creation of a map and estimation of the sensors’ pose in an unknown environment. Visual-based SLAM techniques play a significant role in this field, as they rely on a low-cost and small sensor system, which gives them an advantage over other sensor-based SLAM techniques. The literature presents different approaches and methods to implement visual-based SLAM systems. Among this variety of publications, a beginner in this domain may have difficulty identifying and analyzing the main algorithms and selecting the most appropriate one according to his or her project constraints. Therefore, we present the three main visual-based SLAM approaches (visual-only, visual-inertial, and RGB-D SLAM), providing a review of the main algorithms of each approach through diagrams and flowcharts, and highlighting the main advantages and disadvantages of each technique. Furthermore, we propose six criteria that ease the analysis of SLAM algorithms and consider both the software and hardware levels. In addition, we present some major issues and future directions in the visual-SLAM field, and provide a general overview of some of the existing benchmark datasets. This work aims to be the first step for those initiating a SLAM project to gain a good perspective on the main elements and characteristics of SLAM techniques.

Journal Article

Visual and Visual–Inertial SLAM for UGV Navigation in Unstructured Natural Environments: A Survey of Challenges and Deep Learning Advances

by Viegas, Carlos , Soares, Salviano , Ferreira, Nuno in Algorithms , Automatic guided vehicles , Cameras

2026

Localization and mapping remain critical challenges for Unmanned Ground Vehicles (UGVs) operating in unstructured natural environments, such as forests and agricultural fields. While Visual SLAM (VSLAM) and Visual–Inertial SLAM (VI-SLAM) have matured significantly in structured and urban scenarios, their extension to outdoor natural domains introduces severe challenges, including dynamic vegetation, illumination variations, a lack of distinctive features, and degraded GNSS availability. Recent advances in Deep Learning have brought promising developments to VSLAM- and VI-SLAM-based pipelines, ranging from learned feature extraction and matching to self-supervised monocular depth prediction and differentiable end-to-end SLAM frameworks. Furthermore, emerging methods for adaptive sensor fusion, leveraging attention mechanisms and reinforcement learning, open new opportunities to improve robustness by dynamically weighting the contributions of camera and IMU measurements. This review provides a comprehensive overview of Visual and Visual–Inertial SLAM for UGVs in unstructured environments, highlighting the challenges posed by natural contexts and the limitations of current pipelines. Classic VI-SLAM frameworks and recent Deep-Learning-based approaches are systematically reviewed. Special attention is given to field robotics applications in agriculture and forestry, where low-cost sensors and robustness against environmental variability are essential. Finally, open research directions are discussed, including self-supervised representation learning, adaptive sensor confidence models, and scalable low-cost alternatives. By identifying key gaps and opportunities, this work aims to guide future research toward resilient, adaptive, and economically viable VSLAM and VI-SLAM pipelines, tailored for UGV navigation in unstructured natural environments.

Journal Article

Unsupervised Scale-Consistent Depth Learning from Video

by Li, Zhichao , Shen, Chunhua , Zhang, Le in Ablation , Cameras , Datasets

2021

We propose SC-Depth, a monocular depth estimation method that requires only unlabelled videos for training and enables scale-consistent prediction at inference time. Our contributions include: (i) we propose a geometry consistency loss, which penalizes the inconsistency of predicted depths between adjacent views; (ii) we propose a self-discovered mask to automatically localize moving objects that violate the underlying static scene assumption and cause noisy signals during training; (iii) we demonstrate the efficacy of each component with a detailed ablation study and show high-quality depth estimation results on both the KITTI and NYUv2 datasets. Moreover, thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system for more robust and accurate tracking. The proposed hybrid Pseudo-RGBD SLAM shows compelling results on KITTI, and it generalizes well to the KAIST dataset without additional training. Finally, we provide several demos for qualitative evaluation. The source code is released on GitHub.
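As a rough illustration of what a loss that "penalizes the inconsistency of predicted depths between adjacent views" might look like, the sketch below computes a normalized per-pixel depth difference and derives a down-weighting mask from it. The abstract does not give the formula, so the normalized-difference form, the `1 - diff` mask, and the names `geometry_consistency`, `depth_warped`, and `depth_interp` are all assumptions made for illustration. The two depth maps are assumed to be already warped into the same view, and that warping step is omitted here.

```python
import numpy as np

def geometry_consistency(depth_warped, depth_interp, eps=1e-7):
    """Return (loss, mask) for two depth maps aligned to the same view.

    Assumed form, not confirmed by the abstract: the per-pixel difference is
    normalized by the sum of the two depths, so it lies in [0, 1).
    """
    diff = np.abs(depth_warped - depth_interp) / (depth_warped + depth_interp + eps)
    loss = float(diff.mean())   # average inconsistency over all pixels
    mask = 1.0 - diff           # low weight where the two views disagree
    return loss, mask

# Toy 2x2 example: only the bottom-left pixel disagrees (1.0 vs 3.0).
a = np.array([[2.0, 4.0], [1.0, 3.0]])
b = np.array([[2.0, 4.0], [3.0, 3.0]])
loss, mask = geometry_consistency(a, b)
print(loss, mask)
```

In this toy case the disagreeing pixel receives a lower mask weight than the others, which is the behaviour one would want for regions occupied by moving objects.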

Journal Article

SLAM Overview: From Single Sensor to Heterogeneous Fusion

by Li, Zhenxiong , Hu, Kai , Shang, Guangtao in Algorithms , Cameras , Deep learning

2022

After decades of development, LIDAR and visual SLAM technology has relatively matured and been widely used in the military and civil fields. SLAM technology enables the mobile robot to have the abilities of autonomous positioning and mapping, which allows the robot to move in indoor and outdoor scenes where GPS signals are scarce. However, SLAM technology relying only on a single sensor has its limitations. For example, LIDAR SLAM is not suitable for scenes with highly dynamic or sparse features, and visual SLAM has poor robustness in low-texture or dark scenes. Fusing the two technologies, however, gives them great potential to complement each other. Therefore, this paper predicts that SLAM technology combining LIDAR and visual sensors, as well as various other sensors, will be the mainstream direction in the future. This paper reviews the development history of SLAM technology, analyzes the hardware information of LIDAR and cameras in depth, and presents some classical open source algorithms and datasets. According to the algorithm adopted by the fusion sensor, the traditional multi-sensor fusion methods based on uncertainty, features, and novel deep learning are introduced in detail. The excellent performance of the multi-sensor fusion method in complex scenes is summarized, and the future development of the multi-sensor fusion method is discussed.

Journal Article

Visual SLAM: What Are the Current Trends and What to Expect?

by Sanchez-Lopez, Jose Luis , Tourani, Ali , Voos, Holger in Algorithms , Cameras , Computer Vision

2022

In recent years, Simultaneous Localization and Mapping (SLAM) systems have shown significant gains in performance, accuracy, and efficiency. In this regard, Visual Simultaneous Localization and Mapping (VSLAM) methods refer to the SLAM approaches that employ cameras for pose estimation and map reconstruction and are preferred over Light Detection And Ranging (LiDAR)-based methods due to their lighter weight, lower acquisition costs, and richer environment representation. Hence, several VSLAM approaches have evolved using different camera types (e.g., monocular or stereo), have been tested on various datasets (e.g., Technische Universität München (TUM) RGB-D or European Robotics Challenge (EuRoC)) and in different conditions (i.e., indoors and outdoors), and employ multiple methodologies to have a better understanding of their surroundings. The mentioned variations have made this topic popular for researchers and have resulted in various methods. In this regard, the primary intent of this paper is to assimilate the wide range of works in VSLAM and present their recent advances, along with discussing the existing challenges and trends. This survey aims to give a big picture of the current focuses in the robotics and VSLAM fields based on the concentrated resolutions and objectives of the state-of-the-art. This paper provides an in-depth literature survey of fifty impactful articles published in the VSLAM domain. The mentioned manuscripts have been classified by different characteristics, including the novelty domain, objectives, employed algorithms, and semantic level. The paper also discusses the current trends and contemporary directions of VSLAM techniques that may help researchers investigate them.

Journal Article

Street-view change detection with deconvolutional networks

by Gherardi, Riccardo , Stent, Simon , Alcantarilla, Pablo F in Autonomous navigation , Change detection , Datasets

2018

We propose a system for performing structural change detection in street-view videos captured by a vehicle-mounted monocular camera over time. Our approach is motivated by the need for more frequent and efficient updates in the large-scale maps used in autonomous vehicle navigation. Our method chains a multi-sensor fusion SLAM and fast dense 3D reconstruction pipeline, which provide coarsely registered image pairs to a deep Deconvolutional Network (DN) for pixel-wise change detection. We investigate two DN architectures for change detection: the first is based on stacking contraction and expansion blocks, while the second is based on Fully Convolutional Networks. To train and evaluate our networks, we introduce a new urban change detection dataset which is an order of magnitude larger than existing datasets and contains challenging changes due to seasonal and lighting variations. Our method outperforms existing literature on this dataset, which we make available to the community, and on an existing panoramic change detection dataset, demonstrating its wide applicability.

Journal Article

YOLO-SLAM: A semantic SLAM system towards dynamic environment with geometric constraint

by Wu, Wenxin , Gao, Hongli , Liu, Yuekai in Artificial Intelligence , Computational Biology/Bioinformatics , Computational Science and Engineering

2022

Simultaneous localization and mapping (SLAM), as one of the core prerequisite technologies for intelligent mobile robots, has attracted much attention in recent years. However, traditional SLAM systems rely on the static environment assumption, which makes them unstable in dynamic environments and limits their practical real-world applications. To deal with the problem, this paper presents a dynamic-environment-robust visual SLAM system named YOLO-SLAM. In YOLO-SLAM, a lightweight object detection network named Darknet19-YOLOv3 is designed, which adopts a low-latency backbone to accelerate and generate essential semantic information for the SLAM system. Then, a new geometric constraint method is proposed to filter dynamic features in the detecting areas, where dynamic features can be distinguished by utilizing the depth difference with Random Sample Consensus (RANSAC). YOLO-SLAM combines the object detection approach and the geometric constraint method in a tightly coupled manner, which effectively reduces the impact of dynamic objects. Experiments are conducted on the challenging dynamic sequences of the TUM dataset and the Bonn dataset to evaluate the performance of YOLO-SLAM. The results demonstrate that the RMSE of the absolute trajectory error can be reduced by 98.13% compared with ORB-SLAM2 and by 51.28% compared with DS-SLAM, indicating that YOLO-SLAM effectively improves stability and accuracy in highly dynamic environments.
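For readers unfamiliar with the metric quoted above, the following minimal sketch shows how the RMSE of the absolute trajectory error and a relative change against a baseline are commonly computed. It is not taken from the paper: the function names `ate_rmse` and `percent_reduction` are hypothetical, the toy numbers are invented for illustration, and the two trajectories are assumed to be already time-associated and aligned to a common frame (the alignment step that benchmark tools usually perform is omitted).

```python
import numpy as np

def ate_rmse(estimated, ground_truth):
    """RMSE of per-pose translational error between two aligned trajectories.

    Both inputs are (N, 3) arrays of positions with matching timestamps.
    """
    est = np.asarray(estimated, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    errors = np.linalg.norm(est - gt, axis=1)   # Euclidean error per pose
    return float(np.sqrt(np.mean(errors ** 2)))

def percent_reduction(baseline, improved):
    """Relative reduction of an error metric, in percent of the baseline."""
    return 100.0 * (baseline - improved) / baseline

# Toy data: the second pose is off by a 3-4-5 triangle, so its error is 5.
gt = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
est = [[0.0, 0.0, 0.0], [1.0, 3.0, 4.0]]
print(ate_rmse(est, gt))
print(percent_reduction(0.5, 0.25))
```

Under this definition, a relative reduction close to 100% means the improved system's error is a small fraction of the baseline's.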

Journal Article

GMS: Grid-Based Motion Statistics for Fast, Ultra-robust Feature Correspondence

by Wen-Yan, Lin , Zhang, Le , Reid, Ian in Computer vision , Computing time , Matching

2020

Feature matching aims at generating correspondences across images, which is widely used in many computer vision tasks. Although considerable progress has been made on feature descriptors and fast matching for initial correspondence hypotheses, selecting good ones from them is still challenging and critical to the overall performance. More importantly, existing methods often take a long computational time, limiting their use in real-time applications. This paper attempts to separate true correspondences from false ones at high speed. We term the proposed method grid-based motion statistics (GMS), which incorporates the smoothness constraint into a statistical framework for separation and uses a grid-based implementation for fast calculation. GMS is robust to various challenging image changes, including changes in viewpoint, scale, and rotation. It is also fast, e.g., taking only 1 or 2 ms in a single CPU thread, even when 50K correspondences are processed. This has important implications for real-time applications. Moreover, we show that incorporating GMS into the classic feature matching and epipolar geometry estimation pipeline can significantly boost the overall performance. Finally, we integrate GMS into the well-known ORB-SLAM system for monocular initialization, resulting in a significant improvement.

Journal Article

A survey on vision-based UAV navigation

by Zhang, Liangpei , Xue, Zhucun , Xia, Gui-Song in Autonomous navigation , Computer vision , Global positioning systems

2018

Research on unmanned aerial vehicles (UAVs) has become increasingly popular in the past decades, and UAVs have been widely used in industrial inspection, remote sensing for mapping & surveying, rescue, and so on. Nevertheless, the limited autonomous navigation capability severely hampers the application of UAVs in complex environments, such as GPS-denied areas. Previously, researchers mainly focused on the use of laser or radar sensors for UAV navigation. With the rapid development of computer vision, vision-based methods, which utilize cheaper and more flexible visual sensors, have shown great advantages in the field of UAV navigation. The purpose of this article is to present a comprehensive literature review of the vision-based methods for UAV navigation, focusing specifically on visual localization and mapping, obstacle avoidance, and path planning, which compose the essential parts of visual navigation. Furthermore, throughout this article, we offer insight into the prospects of UAV navigation and the challenges to be faced.

Journal Article
