Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation

Citations: 76
Authors
Peng, Rui [1]
Wang, Rongjie [2]
Wang, Zhenyu [1]
Lai, Yawen [1]
Wang, Ronggang [1, 2]
Affiliations
[1] Peking Univ, Sch Elect & Comp Engn, Beijing, Peoples R China
[2] Peng Cheng Lab, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
SURFACE RECONSTRUCTION;
DOI
10.1109/CVPR52688.2022.00845
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing learning-based multi-view stereo methods solve depth estimation as either a regression or a classification problem. Although both representations have recently demonstrated excellent performance, they still have apparent shortcomings: regression methods tend to overfit because they learn the cost volume only indirectly, and classification methods cannot directly infer the exact depth because their prediction is discrete. In this paper, we propose a novel representation, termed Unification, to unify the advantages of regression and classification. It can directly constrain the cost volume like classification methods, while also achieving sub-pixel depth prediction like regression methods. To exploit the potential of Unification, we design a new loss function named Unified Focal Loss, which is more uniform and reasonable for combating the challenge of sample imbalance. Combining these two unburdened modules, we present a coarse-to-fine framework that we call UniMVSNet. Ranking first on both the DTU and Tanks and Temples benchmarks verifies that our model not only performs the best but also has the best generalization ability.
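The contrast the abstract draws between the two existing representations can be sketched for a single pixel as follows. This is a minimal illustration, not the authors' implementation: the function names, the depth hypotheses, and the scores are all made up for the example, and `depth_by_unity` in particular is only one assumed way a "nearest hypothesis plus continuous offset" output could be decoded, since the abstract does not specify the decoding.

```python
import math


def softmax(scores):
    """Turn raw per-hypothesis scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def depth_by_regression(probs, hypotheses):
    """Regression view: expected depth under the distribution (soft-argmax).

    The result is continuous, but the loss acts on this expectation,
    so the distribution itself is only constrained indirectly.
    """
    return sum(p * d for p, d in zip(probs, hypotheses))


def depth_by_classification(probs, hypotheses):
    """Classification view: depth of the most probable hypothesis (argmax).

    The distribution is supervised directly, but the result is
    restricted to the discrete hypothesis grid.
    """
    best = max(range(len(probs)), key=probs.__getitem__)
    return hypotheses[best]


def depth_by_unity(unity, hypotheses, interval):
    """Assumed decoding for illustration only (not taken from the abstract).

    Each hypothesis carries a value in [0, 1]; the strongest hypothesis is
    selected and shifted by a continuous offset proportional to (1 - value).
    """
    best = max(range(len(unity)), key=unity.__getitem__)
    return hypotheses[best] + (1.0 - unity[best]) * interval


# Hypothetical depth hypotheses (arbitrary units) and scores for one pixel.
hypotheses = [400.0, 410.0, 420.0, 430.0]
probs = softmax([0.1, 2.0, 1.5, 0.2])

regressed = depth_by_regression(probs, hypotheses)
classified = depth_by_classification(probs, hypotheses)
unified = depth_by_unity([0.0, 0.7, 0.0, 0.0], hypotheses, interval=10.0)
```

Here `classified` can only take one of the four grid values, while `regressed` and `unified` can fall between them; the difference between the latter two lies in whether supervision reaches the per-hypothesis values directly or only through the expectation.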
Pages: 8635-8644
Page count: 10