Catalogue Search | MBRL

Details preserved unsupervised depth estimation by fusing traditional stereo knowledge from laparoscopic images

by Luo, Huoling , Hu, Qingmao , Jia, Fucang in Algorithms , confidence measure , constrain neighbouring pixels

2019

Depth estimation plays an important role in vision-based laparoscopic surgical navigation systems. Most learning-based depth estimation methods require ground truth depth or disparity images for training; however, these data are difficult to obtain in laparoscopy. The authors present an unsupervised learning depth estimation approach that fuses traditional stereo knowledge. A traditional stereo method is used to generate proxy disparity labels, from which unreliable depth measurements are removed via a confidence measure to improve stereo accuracy. The disparity images are generated by training a dual encoder–decoder convolutional neural network on rectified stereo images coupled with the proxy labels. A principled mask is computed to exclude from the loss calculation those pixels that are not seen in one of the views because of parallax effects. Moreover, a neighbourhood smoothness term is employed to constrain neighbouring pixels with similar appearances to generate a smooth depth surface. This approach brings the depth of the projected point cloud closer to the real surgical site and preserves realistic details. The authors demonstrate the performance of the method by training and evaluating it on a partial nephrectomy da Vinci surgery dataset and heart phantom data from the Hamlyn Centre.
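
The abstract does not say which confidence measure or mask the authors use. As an illustration only, the sketch below uses a left-right consistency check, a common test in traditional stereo, to flag pixels on one rectified scanline whose left and right disparities disagree or that project outside the other view. The function name, the `tol` threshold and the list-based scanline format are assumptions made for this example, not details from the paper.

```python
def left_right_consistency_mask(disp_left, disp_right, tol=1.0):
    """Return a validity mask for one rectified scanline.

    disp_left[x]  is the disparity of left-image pixel x, which is assumed
                  to match right-image pixel x - disp_left[x].
    disp_right[x] is the disparity stored at right-image pixel x.
    A pixel is kept only if both views agree to within `tol` pixels.
    """
    width = len(disp_left)
    mask = []
    for x, d in enumerate(disp_left):
        xr = int(round(x - d))  # matching column in the right image
        if xr < 0 or xr >= width:
            # The match falls outside the right image, so the pixel is
            # not visible in both views.
            mask.append(False)
            continue
        mask.append(abs(d - disp_right[xr]) <= tol)
    return mask


def masked_mean_abs_error(pred, target, mask):
    """Mean absolute error computed over the masked-in pixels only."""
    kept = [abs(p - t) for p, t, m in zip(pred, target, mask) if m]
    return sum(kept) / len(kept) if kept else 0.0
```

Pixels rejected by such a mask would simply be left out of the loss, which is the general idea the abstract describes for pixels hidden by parallax.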

Journal Article

Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior

by Yu, Yanjiang , Li, Changsheng , Liu, Wei in Algorithms , Computer vision , Datasets

2022

Rain is a common natural phenomenon. Taking images in the rain, however, often degrades image quality and thus compromises the performance of many computer vision systems. Most existing deraining algorithms use only a single input image and aim to recover a clean image. Few works have exploited stereo images. Moreover, even for single-image monocular deraining, many current methods fail to complete the task satisfactorily because they mostly rely on per-pixel loss functions and ignore semantic information. In this paper, we present a Paired Rain Removal Network (PRRNet), which exploits both stereo images and semantic information. Specifically, we develop a Semantic-Aware Deraining Module (SADM), which solves the tasks of semantic segmentation and deraining of scenes together, and a Semantic-Fusion Network (SFNet) and a View-Fusion Network (VFNet), which fuse semantic information and multi-view information respectively. In addition, we introduce an Enhanced Paired Rain Removal Network (EPRRNet), which exploits a semantic prior to remove rain streaks from stereo images. We first use a coarse deraining network to reduce the rain streaks in the input images, and then adopt a pre-trained semantic segmentation network to extract semantic features from the coarsely derained image. Finally, a parallel stereo deraining network fuses semantic and multi-view information to restore finer results. We also propose new stereo-based rainy datasets for benchmarking. Experiments on both monocular and the newly proposed stereo rainy datasets demonstrate that the proposed method achieves state-of-the-art performance. https://github.com/HDCVLab/Stereo-Image-Deraining.

Journal Article

Colour correction of stereo images using local correspondence

by Lee, In-Kwon , Lee, Yong-Ho in Algorithms , Applied sciences , Artificial intelligence

2014

A method for enhancing the accuracy of colour correction in stereo images is proposed. The goal is to perform colour compensation at the corresponding points in a stereo image pair using local correspondence information. First, the relevant matching points for colour transfer are extracted using a modified stereo matching algorithm. Then, in order to account for illumination variations and occlusion areas, a new colour transfer method is applied through the weighted sum of the colour difference using local features. The performance of the proposed algorithm has been tested on the reference colour-modified Middlebury dataset.

Journal Article

Stereo sample generation‐based domain generalization network for stereo matching

by Zhang, Zhe , Xu, Liying , Peng, Bo in computer vision , stereo image processing

2024

Recently, deep learning‐based stereo matching has achieved great success. However, models trained on a source domain dataset suffer substantial performance degradation when tested directly on an unseen target domain dataset, because they neglect generalization to out‐of‐distribution (OOD) stereo samples. This paper proposes a stereo sample generation‐based domain generalization network (SGDG‐Net) for stereo matching. Specifically, to expand the distribution span of training samples, OOD stereo samples are generated to assist training. To effectively generate OOD left samples, a style transfer‐based generation mechanism is proposed to transmit perturbations to the source left samples. In addition, to generate the OOD right samples, a disparity‐assisted generation strategy is proposed that uses disparity map labels as auxiliary information. Experimental results demonstrate that the proposed SGDG‐Net produces remarkable results on four benchmark datasets.

Journal Article

Enhanced blur‐robust monocular depth estimation via self‐supervised learning

by Kim, Seong‐Yeol , Shin, Ho‐Ju , Lee, Se‐Ho in computer vision , Image and Vision Processing and Display Technology , image processing

2024

This letter presents a novel self‐supervised learning strategy to improve the robustness of a monocular depth estimation (MDE) network against motion blur. Motion blur, a common problem in real‐world applications like autonomous driving and scene reconstruction, often hinders accurate depth perception. Conventional MDE methods are effective under controlled conditions but struggle to generalise their performance to blurred images. To address this problem, we generate blur‐synthesised data to train a robust MDE model without the need for preprocessing, such as deblurring. By incorporating self‐distillation techniques and using blur‐synthesised data, the depth estimation accuracy for blurred images is significantly enhanced without additional computational or memory overhead. Extensive experimental results demonstrate the effectiveness of the proposed method, enhancing existing MDE models to accurately estimate depth information across various blur conditions.

Journal Article

Improved seam carving for stereo image resizing

by Yue, Bin , Hou, Chun-ping , Zhou, Yuan in Algorithms , Communications Engineering , Consistency

2013

When stereo images are shown on three-dimensional (3D) display devices with different aspect ratios, resizing algorithms designed for single images can distort the shape and depth of the stereo image’s main content. This paper proposes a novel method for retargeting stereo image pairs without distorting important objects in the scene while still maintaining consistency between the left and right images. We extend the seam carving algorithm to stereo images. The novelty of our method is that important objects are determined by jointly considering the intensities of gradients and the visual fusion area. The retargeted stereo pair has a feasible 3D interpretation that is similar to the original one. Our method protects the important content and reduces the visual distortion in each of the images as well as the depth distortion. Experimental results demonstrate that the proposed method effectively guarantees the geometric consistency of resized stereo images.
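
For orientation, the sketch below shows the classic single-image seam carving step that this entry builds on: dynamic programming finds the connected top-to-bottom path of lowest total energy, and that path is then removed. It is not the authors' stereo extension. The energy map is assumed to be given as a list of rows, and the function names are chosen for this example.

```python
def min_vertical_seam(energy):
    """Return, for each row, the column of the lowest-energy vertical seam.

    `energy` is a list of equal-length rows of per-pixel energies
    (for example gradient magnitudes).
    """
    rows, cols = len(energy), len(energy[0])
    cost = [row[:] for row in energy]  # cumulative minimum cost table
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])

    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam


def remove_vertical_seam(image, seam):
    """Drop one pixel per row, narrowing the image by one column."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]
```

A stereo version would additionally have to couple the seams chosen in the left and right images so that the pair stays geometrically consistent, which is the problem the entry addresses.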

Journal Article

Comprehensive Bird Preservation at Wind Farms

by Kaniecki, Damian , Gradolewski, Dawid , Jaworski, Adam in Aircraft detection , Airports , algorithm

2021

Wind, as a clean and renewable energy source, has been used by humans for centuries. However, in recent years, with the increase in the number and size of wind turbines, their impact on avifauna has become worrisome. Researchers have estimated that in the U.S. up to 500,000 birds die annually due to collisions with wind turbines. This article proposes a system for mitigating bird mortality around wind farms. The solution is based on a stereo-vision system embedded in distributed computing and IoT paradigms. After a bird is detected in a defined zone, the decision-making system activates a collision avoidance routine composed of light and sound deterrents and the turbine stopping procedure. The development process applies a User-Driven Design approach along with a process of component selection and heuristic adjustment. The proposal includes a bird detection method and a localization procedure. Bird identification is carried out using artificial intelligence algorithms. Validation tests with a fixed-wing drone and verifying observations by ornithologists demonstrated the system’s desired reliability in detecting a bird with a wingspan over 1.5 m from at least 300 m. Moreover, the suitability of the system for classifying the size of the detected bird into one of three wingspan categories (small, medium and large) was confirmed.

Journal Article

Classification of Land Cover, Forest, and Tree Species Classes with ZiYuan-3 Multispectral and Stereo Data

by Chen, Yaoliang , Chen, Erxue , Xie, Zhuli in Algorithms , Artificial intelligence , Artificial neural networks

2019

The global availability of high spatial resolution images makes mapping tree species distribution possible for better management of forest resources. Previous research mainly focused on mapping single tree species, but information about the spatial distribution of all kinds of trees, especially plantations, is often required. This research aims to identify suitable variables and algorithms for classifying land cover, forest, and tree species. Bi-temporal ZiYuan-3 multispectral and stereo images were used. Spectral responses and textures from multispectral imagery, canopy height features from bi-temporal stereo imagery, and slope and elevation from the stereo-derived digital surface model data were examined through comparative analysis of six classification algorithms: maximum likelihood classifier (MLC), k-nearest neighbor (kNN), decision tree (DT), random forest (RF), artificial neural network (ANN), and support vector machine (SVM). The results showed that the use of multiple source data (spectral bands, vegetation indices, textures, and topographic factors) considerably improved land-cover and forest classification accuracies compared to spectral bands alone; the highest overall accuracy for land cover classes (84.5%) came from the SVM, and the highest for forest classes (89.2%) came from the MLC. The combination of leaf-on and leaf-off seasonal images further improved classification accuracies by 7.8% to 15.0% for land cover classes and by 6.0% to 11.8% for forest classes compared to a single-season spectral image. The combination of multiple source data also improved land cover classification by 3.7% to 15.5% and forest classification by 1.0% to 12.7% compared to the spectral image alone. MLC provided better land-cover and forest classification accuracies than machine learning algorithms when spectral data alone were used. However, some machine learning approaches such as RF and SVM performed better than MLC when multiple data sources were used. Further addition of canopy height features to the multiple source data had no or limited effect on land-cover or forest classification, but improved the classification accuracies of some tree species such as birch and Mongolia scotch pine. For tree species classification, Chinese pine, Mongolia scotch pine, red pine, aspen and elm, and other broadleaf trees had classification accuracies of over 92%, while larch and birch had relatively low accuracies of 87.3% and 84.5%. However, these high classification accuracies come from different data sources and classification algorithms, and no single classification algorithm provided the best accuracy for all tree species classes. This research implies that no single data source and classification algorithm can provide the best classification results for all land cover classes. It is necessary to develop a comprehensive classification procedure, using an expert-based or hierarchical classification approach, that can employ specific data variables and algorithms for each tree species class.

Journal Article

Learning a convolutional neural network for propagation-based stereo image segmentation

by Li, Xujie , Wang, Yandan , Zhao, Hanli in Algorithms , Artificial Intelligence , Artificial neural networks

2020

Stereo image segmentation is a key technology in stereo image editing, given the growing popularity of stereoscopic 3D media. Most previous methods perform stereo image segmentation on both views relying primarily on per-pixel disparities, so the segmentation quality is closely tied to the accuracy of the disparities. A mechanism to remove disparity errors is therefore in high demand. To date, no method can produce fully accurate disparity maps. In this paper, we propose a novel convolutional neural network (CNN)-based framework that automatically propagates the segmentation result from one view to the other. A key problem for accurate stereo image segmentation is that occluded regions are missed. To solve this problem, a CNN architecture is proposed to improve the stereo segmentation performance. To address the inevitable inaccuracies of the disparities computed from a stereo pair of images, we use coherent disparity propagation, which propagates the segmentation result via those pixels with coherent disparities. The pixels obtained by coherent disparity propagation and the high-confidence pixels of the object probability map produced by the CNN architecture are then used as the initial reliable pixels for an energy minimization-based segmentation. Comprehensive evaluations and comparisons on the Middlebury and Adobe benchmark datasets show the effectiveness of our proposed method in terms of high-quality results and robustness against various types of inputs.

Journal Article

Multi-UAV-based stereo vision system without GPS for ground obstacle mapping to assist path planning of UGV

by Kim, Jin Hyo , Seo, Jiwon , Kwon, Ji-Wook in altitude estimation , autonomous aerial vehicles , collision avoidance

2014

A multi-unmanned aerial vehicle (UAV)-based stereo vision system is proposed to assist global path planning of an unmanned ground vehicle (UGV) even in GPS-denied environments. The proposed system can optimally generate the depth map of ground objects and robustly detect obstacles. The proposed multi-UAV-based system with a movable baseline overcomes the limitations of a single-UAV-based stereo vision system with a fixed baseline. Thus, the performance of the proposed system does not degrade significantly with the altitude of the UAVs. Relative position and altitude estimation, multi-agent formation control and image processing techniques are used to implement a prototype system. The experimental results demonstrate the performance of the implemented system for various baseline conditions between UAVs.
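
The abstract does not give the camera model, so the following is only a sketch under the standard rectified pinhole stereo assumption, where depth is Z = f·B/d for focal length f (in pixels), baseline B and disparity d. Under that assumption, a one-pixel disparity error produces a depth error of roughly Z²/(f·B), which is why a longer baseline helps at higher altitude. The function names and the example numbers are illustrative, not taken from the paper.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order depth uncertainty: dZ ~= Z^2 * dd / (f * B)."""
    return depth_m ** 2 * disparity_error_px / (focal_px * baseline_m)


if __name__ == "__main__":
    f = 1000.0  # assumed focal length in pixels
    for baseline in (0.5, 2.0):
        err = depth_error(f, baseline, depth_m=50.0)
        print(f"baseline {baseline} m: about {err} m error at 50 m depth")
```

With these assumed values, widening the baseline from 0.5 m to 2.0 m cuts the first-order depth error at 50 m by a factor of four.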

Journal Article
