Unsupervised 3D reconstruction method based on multi-view propagation

Cited by: 0
Authors
Luo J. [1]
Yuan D. [1]
Zhang L. [2]
Qu Y. [1]
Su S. [1]
Affiliations
[1] School of Automation, Northwestern Polytechnical University, Xi'an
[2] School of Cyber Engineering, Xidian University, Xi'an
Keywords
3D reconstruction; multi-metric loss function; multi-view propagation; Patchmatch algorithm; unsupervised
DOI
10.1051/jnwpu/20244210129
Abstract
This paper proposes an end-to-end deep learning framework that reconstructs 3D models by computing depth maps from multiple views. The proposed unsupervised 3D reconstruction method based on multi-view propagation addresses two issues: the large GPU memory consumption of most current methods, which regularize a 3D cost volume with 3D convolutions and regress the initial depth map from it, and the difficulty of obtaining ground-truth depth values for supervised methods due to device limitations. Inspired by the PatchMatch algorithm, the method divides the depth range into n layers and generates depth hypotheses through multi-view propagation. Moreover, a multi-metric loss function built from photometric consistency, structural similarity, and depth smoothness across views serves as the supervisory signal for learning depth prediction in the network. Experimental results show that the proposed method achieves highly competitive performance and generalization on the DTU, Tanks & Temples, and a self-made dataset; specifically, it is at least 1.7 times faster and requires more than 75% less memory than methods that rely on 3D cost volume regularization. ©2024 Journal of Northwestern Polytechnical University.
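To make the multi-metric supervisory signal concrete, below is a minimal PyTorch-style sketch of a loss combining photometric consistency, structural similarity (SSIM), and edge-aware depth smoothness. The term weights, the 3x3 SSIM window, the specific smoothness form, and the assumption that source views have already been warped into the reference view with the predicted depth are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of an unsupervised multi-metric loss: photometric
# consistency + SSIM + edge-aware depth smoothness. Weights are assumed.
import torch
import torch.nn.functional as F


def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified SSIM over 3x3 windows; returns a per-pixel dissimilarity map."""
    mu_x = F.avg_pool2d(x, 3, 1, padding=1)
    mu_y = F.avg_pool2d(y, 3, 1, padding=1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, padding=1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, padding=1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, padding=1) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    )
    return torch.clamp((1 - ssim_map) / 2, 0, 1)


def smoothness(depth, image):
    """First-order depth smoothness, down-weighted at image edges."""
    d_dx = torch.abs(depth[:, :, :, :-1] - depth[:, :, :, 1:])
    d_dy = torch.abs(depth[:, :, :-1, :] - depth[:, :, 1:, :])
    i_dx = torch.mean(torch.abs(image[:, :, :, :-1] - image[:, :, :, 1:]), 1, keepdim=True)
    i_dy = torch.mean(torch.abs(image[:, :, :-1, :] - image[:, :, 1:, :]), 1, keepdim=True)
    return (d_dx * torch.exp(-i_dx)).mean() + (d_dy * torch.exp(-i_dy)).mean()


def multi_metric_loss(ref_img, warped_src_imgs, depth,
                      w_photo=0.8, w_ssim=0.2, w_smooth=0.1):
    """Combine the three terms over all source views warped into the reference view.

    ref_img:         (B, 3, H, W) reference image
    warped_src_imgs: list of (B, 3, H, W) source images warped via the predicted depth
    depth:           (B, 1, H, W) predicted depth map
    """
    photo, struct = 0.0, 0.0
    for warped in warped_src_imgs:
        photo = photo + torch.abs(ref_img - warped).mean()   # photometric consistency
        struct = struct + ssim(ref_img, warped).mean()        # structural similarity
    n = max(len(warped_src_imgs), 1)
    return (w_photo * photo + w_ssim * struct) / n + w_smooth * smoothness(depth, ref_img)
```

In this sketch the warping step (projecting each source image into the reference view using the candidate depth and camera parameters) is assumed to happen upstream, as part of the multi-view propagation that produces and evaluates the depth hypotheses.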
Pages: 129-137
Page count: 8