Catalogue Search | MBRL
Explore the vast range of titles available.
6 result(s) for "MVSNet"
Vis-MVSNet: Visibility-Aware Multi-view Stereo Network
2023
Learning-based multi-view stereo (MVS) methods have demonstrated promising results. However, very few existing networks explicitly take pixel-wise visibility into consideration, resulting in erroneous cost aggregation from occluded pixels. In this paper, we explicitly infer and integrate pixel-wise occlusion information in the MVS network via matching uncertainty estimation. The pair-wise uncertainty map is jointly inferred with the pair-wise depth map and is then used as weighting guidance during multi-view cost volume fusion, so that the adverse influence of occluded pixels is suppressed in the cost fusion. The proposed framework, Vis-MVSNet, significantly improves depth accuracy when reconstructing scenes with severe occlusion. Extensive experiments on the DTU, BlendedMVS, Tanks and Temples, and ETH3D datasets justify the effectiveness of the proposed framework.
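The uncertainty-weighted fusion the abstract describes can be illustrated with a minimal NumPy sketch. The array shapes and the exponential weighting below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def fuse_cost_volumes(pairwise_costs, uncertainties):
    """Fuse per-view cost volumes, down-weighting uncertain (likely occluded) pixels.

    pairwise_costs: list of (D, H, W) cost volumes, one per source view.
    uncertainties:  list of (H, W) uncertainty maps, one per source view.
    """
    weighted_sum = np.zeros_like(pairwise_costs[0])
    weight_sum = np.zeros(pairwise_costs[0].shape[1:])
    for cost, unc in zip(pairwise_costs, uncertainties):
        w = np.exp(-unc)            # high uncertainty -> low weight
        weighted_sum += cost * w    # (H, W) weight broadcasts over the depth axis
        weight_sum += w
    return weighted_sum / (weight_sum + 1e-8)
```

A view whose pixels carry large uncertainty (e.g., occluded regions) then contributes almost nothing to the fused volume.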
Journal Article
Research on Multi-View Stereo Network Based on Self-Attention Mechanism
2025
As the technologies of virtual reality and augmented reality rapidly advance, the demand for high-quality 3D models has been growing exponentially. However, the Multi-View Stereo Network (MVSNet) for 3D reconstruction has faced issues with the inaccurate extraction of global image information and depth cues. In response to these challenges, this paper presents enhancements to MVSNet. First, the self-attention mechanism is introduced to enhance MVSNet's ability to capture global information in images. Second, a residual structure is added to mitigate the accuracy loss caused by the downsampling and upsampling of feature maps during the regularization process of cost volume, thus ensuring the integrity of information and transmission efficiency. Experimental results indicate that, in comparison with the original MVSNet, the SelfRes-MVSNet reduces the error rate by 1.3% in terms of overall accuracy and completeness, thereby improving the reconstruction effect from 2D images to 3D models.
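The global-information capture attributed to self-attention can be sketched as plain scaled dot-product attention over a flattened feature map (a generic sketch, not the paper's exact module):

```python
import numpy as np

def self_attention(feat):
    """Scaled dot-product self-attention over a flattened feature map.

    feat: (N, C) array of N spatial positions with C channels.
    Returns an (N, C) map where each position aggregates global context.
    """
    d = feat.shape[1]
    scores = feat @ feat.T / np.sqrt(d)           # (N, N) pairwise similarity
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over positions
    return attn @ feat                            # global aggregation
```

Each output position is a convex combination of all input positions, which is what lets the network see beyond a fixed convolutional receptive field.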
Journal Article
Multi-View Stereo Using Perspective-Aware Features and Metadata to Improve Cost Volume
by Mo, Fan; Zhou, Yu; Li, Yuanxiang
in 3D reconstruction; Artificial intelligence; Comparative analysis
2025
Feature matching is pivotal when using multi-view stereo (MVS) to reconstruct dense 3D models from calibrated images. This paper proposes PAC-MVSNet, which integrates perspective-aware convolution (PAC) and metadata-enhanced cost volumes to address the challenges posed by reflective and texture-less regions. We introduce feature matching with long-range tracking, which uses both internal and external focuses to integrate extensive contextual information within individual images and across multiple images. To strengthen this matching, the PAC module dynamically aligns convolutional kernels with scene perspective lines, extracting perspective-aware features along those lines, while metadata (e.g., camera pose distance) enables geometric reasoning during cost aggregation. Finally, we craft a dedicated 2D CNN that fuses image priors, integrating keyframes and geometric metadata within the cost volume to evaluate depth planes. To our knowledge, this is the first attempt to embed existing physical-model knowledge into a network for MVS tasks, and it achieves top performance on multiple benchmark datasets.
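The idea of aligning sampling taps with a perspective line, rather than a fixed square neighborhood, can be shown with a toy stand-in; the per-image direction vector and uniform tap spacing below are simplifying assumptions, not the paper's learned alignment:

```python
import numpy as np

def sample_along_direction(feat, direction, taps=3):
    """Average features at integer offsets along a direction vector.

    A toy stand-in for a perspective-aware kernel: instead of a fixed
    square neighborhood, taps are placed along `direction` (dy, dx).
    feat: (H, W) feature map; direction: (dy, dx) integer step.
    """
    H, W = feat.shape
    dy, dx = direction
    out = np.zeros_like(feat)
    for t in range(-(taps // 2), taps // 2 + 1):
        ys = np.clip(np.arange(H)[:, None] + t * dy, 0, H - 1)  # clamp at borders
        xs = np.clip(np.arange(W)[None, :] + t * dx, 0, W - 1)
        out += feat[ys, xs]
    return out / taps
```

A real perspective-aware convolution would learn per-pixel offsets and kernel weights; this sketch only conveys the geometric intuition of sampling along a line.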
Journal Article
DSC-MVSNet: attention aware cost volume regularization based on depthwise separable convolution for multi-view stereo
by Zhang, Lili; Wang, Yang; Wei, Zhiwei
in Complexity; Computational Intelligence; Data Structures and Information Theory
2023
Deep learning has recently been proven to deliver excellent performance in multi-view stereo (MVS). However, deep learning-based MVS approaches find it difficult to balance efficiency and effectiveness. To this end, we propose DSC-MVSNet, a novel coarse-to-fine, end-to-end framework for more efficient and more accurate depth estimation in MVS. In particular, we propose an attention-aware 3D UNet-shaped network that uses depthwise separable convolutions for cost volume regularization: by factoring the ordinary convolution on the cost volume into a depthwise convolution and a pointwise convolution, it aggregates information effectively while significantly reducing model parameters and computation. Besides, a 3D-Attention module is proposed to alleviate the feature-mismatching problem in cost volume regularization and to aggregate the important information of the cost volume along three dimensions (i.e. channel, space, and depth). Moreover, we propose an efficient Feature Transfer Module that upsamples the low-resolution (LR) depth map to a high-resolution (HR) depth map for higher accuracy. With extensive experiments on two benchmark datasets, i.e. DTU and Tanks & Temples, we demonstrate that our model's parameters are reduced to 25% of those of the state-of-the-art model MVSNet, while our method outperforms or stays on par with state-of-the-art models in accuracy. Our source code is available at https://github.com/zs670980918/DSC-MVSNet.
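The parameter saving from the depthwise separable factorization is easy to verify with a back-of-envelope count; the channel and kernel sizes below are illustrative, not DSC-MVSNet's actual configuration:

```python
def conv3d_params(c_in, c_out, k):
    """Parameter count of a standard 3D convolution (no bias)."""
    return c_in * c_out * k ** 3

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise (one k^3 filter per input channel) + pointwise (1x1x1) conv."""
    return c_in * k ** 3 + c_in * c_out

# e.g. a 3x3x3 layer mapping 32 -> 32 channels
standard = conv3d_params(32, 32, 3)                 # 27648 parameters
separable = depthwise_separable_params(32, 32, 3)   # 864 + 1024 = 1888 parameters
```

For this toy layer the factorized version needs under 7% of the standard layer's parameters, which is the mechanism behind the abstract's reported reduction.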
Journal Article
FS-MVSNet: A Multi-View Image-Based Framework for 3D Forest Reconstruction and Parameter Extraction of Single Trees
2025
With the rapid advancement of smart forestry, 3D reconstruction and the extraction of structural parameters have emerged as indispensable tools in modern forest monitoring. Although traditional methods involving LiDAR and manual surveys remain effective, they often entail considerable operational complexity and fluctuating costs. To provide a cost-effective and scalable alternative, this study introduces FS-MVSNet—a multi-view image-based 3D reconstruction framework incorporating feature pyramid structures and attention mechanisms. Field experiments were performed in three representative forest parks in Beijing, characterized by open canopies and minimal understory, creating the optimal conditions for photogrammetric reconstruction. The proposed workflow encompasses near-ground image acquisition, image preprocessing, 3D reconstruction, and parameter estimation. FS-MVSNet resulted in an average increase in point cloud density of 149.8% and 22.6% over baseline methods, and facilitated robust diameter at breast height (DBH) estimation through an iterative circle-fitting strategy. Across four sample plots, the DBH estimation accuracy surpassed 91%, with mean improvements of 3.14% in AE, 1.005 cm in RMSE, and 3.64% in rRMSE. Further evaluations on the DTU dataset validated the reconstruction quality, yielding scores of 0.317 mm for accuracy, 0.392 mm for completeness, and 0.372 mm for overall performance. The proposed method demonstrates strong potential for low-cost and scalable forest surveying applications. Future research will investigate its applicability in more structurally complex and heterogeneous forest environments, and benchmark its performance against state-of-the-art LiDAR-based workflows.
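The circle-fitting step used for DBH estimation can be sketched with a standard algebraic (Kasa) least-squares fit on a stem cross-section; the iterative outlier-rejection loop the abstract mentions is omitted:

```python
import numpy as np

def fit_circle(xy):
    """Algebraic (Kasa) least-squares circle fit to 2D points.

    xy: (N, 2) array of stem cross-section points (e.g., a point cloud
    slice at breast height). Returns (cx, cy, r); 2*r estimates the DBH.
    Solves x^2 + y^2 = 2*cx*x + 2*cy*y + c with c = r^2 - cx^2 - cy^2.
    """
    x, y = xy[:, 0], xy[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones(len(x))])
    b = x ** 2 + y ** 2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    r = np.sqrt(c + cx ** 2 + cy ** 2)
    return cx, cy, r
```

An iterative strategy like the paper's would repeat this fit, discarding points far from the current circle (branches, noise) until the estimate stabilizes.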
Journal Article
Multi-View Three-Dimensional Reconstruction Based on Feature Enhancement and Weight Optimization Network
2025
To address the limited adaptability of existing multi-view stereo reconstruction methods to repetitive and weak textures in multi-view images, this paper proposes a three-dimensional (3D) reconstruction algorithm based on a Feature Enhancement and Weight Optimization MVSNet (abbreviated as FEWO-MVSNet). To obtain accurate and detailed global and local features, we first develop an adaptive feature enhancement approach that extracts multi-scale information from the images. Second, we introduce an attention mechanism and a spatial feature capture module to enable highly sensitive detection of weak texture features. Third, based on a 3D convolutional neural network, a fine depth map is predicted for the multi-view images and the complete 3D model is subsequently reconstructed. Last, we evaluated FEWO-MVSNet through training and testing on the DTU, BlendedMVS, and Tanks and Temples datasets. The results demonstrate clear advantages of our method for 3D reconstruction from multi-view images: it ranks first in accuracy and second in completeness among the representative existing methods compared.
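The multi-scale feature extraction such methods rely on can be illustrated with a toy pyramid built by repeated 2x2 average pooling (a generic sketch, not the paper's adaptive enhancement module):

```python
import numpy as np

def multiscale_features(feat, levels=3):
    """Build a simple feature pyramid by repeated 2x2 average pooling.

    feat: (H, W) map with H and W divisible by 2**(levels - 1).
    Returns a list of maps from fine to coarse; coarse levels capture
    global context, fine levels preserve local detail.
    """
    pyramid = [feat]
    for _ in range(levels - 1):
        f = pyramid[-1]
        H, W = f.shape
        f = f.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))  # 2x2 average pool
        pyramid.append(f)
    return pyramid
```

A network would typically process each level separately and fuse them back together, so that weak-texture regions benefit from the coarse levels' wider context.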
Journal Article