Review of multi-view stereo reconstruction methods based on deep learning

Cited by: 1
Authors
Yan H. [1]
Xu F. [1]
Huang L. [2]
Liu C. [1]
Lin C. [1]
Affiliations
[1] School of Science, Jiangxi University of Science and Technology, Ganzhou
[2] School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou
Keywords
3D reconstruction; deep learning; depth estimation; homography transformation; multi-view stereo
DOI
10.37188/OPE.20233116.2444
Abstract
Multi-view stereo (MVS) reconstruction aims to reconstruct a 3D model of a scene from a set of multi-view images with known camera parameters, and it has become a mainstream approach to 3D reconstruction in recent years. This paper evaluates and compares hundreds of recent deep-learning-based MVS methods. First, we organize the existing supervised MVS methods according to the stages of the reconstruction pipeline (feature extraction, cost volume construction, cost volume regularization, and depth regression), focusing on the improvement strategies proposed for cost volume construction and cost volume regularization. For unsupervised MVS methods, we mainly analyze the design of each algorithm's loss terms and classify the methods by training mode. Second, we summarize the datasets commonly used by MVS methods and their corresponding performance evaluation metrics, and examine how strategies such as feature pyramid networks, attention mechanisms, and coarse-to-fine schemes affect the performance of MVS networks. In addition, we introduce specific application scenarios of MVS methods, including digital twins, autonomous driving, robotics, heritage conservation, bioscience, and other fields. Finally, we offer suggestions for improving MVS methods and discuss the technical difficulties and research directions ahead for MVS 3D reconstruction. © 2023 Chinese Academy of Sciences. All rights reserved.
Pages: 2444-2464
Number of pages: 20