On Robust Cross-view Consistency in Self-supervised Monocular Depth Estimation

Cited by: 0
Authors
Haimei Zhao
Jing Zhang
Zhuo Chen
Bo Yuan
Dacheng Tao
Affiliations
[1] University of Sydney, School of Computer Science
[2] Tsinghua University, Shenzhen International Graduate School
[3] University of Queensland, School of Information Technology & Electrical Engineering
Keywords
3D vision; depth estimation; cross-view consistency; self-supervised learning; monocular perception
DOI: Not available
Abstract
Remarkable progress has been made in self-supervised monocular depth estimation (SS-MDE) by exploring cross-view consistency, e.g., photometric consistency and 3D point cloud consistency. However, these constraints are highly vulnerable to illumination variation, occlusions, texture-less regions, and moving objects, and are therefore not robust enough to handle diverse scenes. To address this challenge, we study two kinds of robust cross-view consistency in this paper. First, the spatial offset field between adjacent frames is obtained by reconstructing the reference frame from its neighbors via deformable alignment, and is used to align the temporal depth features through a depth feature alignment (DFA) loss. Second, the 3D point clouds of each reference frame and its nearby frames are calculated and transformed into voxel space, where the point density in each voxel is computed and aligned via a voxel density alignment (VDA) loss. In this way, we exploit the temporal coherence in both the depth feature space and the 3D voxel space for SS-MDE, shifting the “point-to-point” alignment paradigm to a “region-to-region” one. Compared with the photometric consistency loss and the rigid point cloud alignment loss, the proposed DFA and VDA losses are more robust owing to the strong representation power of deep features and the high tolerance of voxel density to the aforementioned challenges. Experimental results on several outdoor benchmarks show that our method outperforms current state-of-the-art techniques. Extensive ablation studies and analyses validate the effectiveness of the proposed losses, especially in challenging scenes. The code and models are available at https://github.com/sunnyHelen/RCVC-depth.
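To make the voxel density idea concrete, below is a minimal sketch of a VDA-style loss, assuming the point clouds have already been back-projected from the predicted depths and expressed in a common reference frame. The names (voxel_density, vda_loss), grid bounds, and resolution are illustrative assumptions, not the authors' released code; the hard voxel assignment here is non-differentiable, whereas a practical implementation would need a soft or differentiable counting scheme.

```python
# Sketch of a voxel density alignment (VDA) style loss.
# Assumes point clouds are already back-projected from predicted depth
# and expressed in the same (reference) coordinate frame.
import torch


def voxel_density(points, grid_min, grid_max, resolution):
    """Count points per voxel on a regular 3D grid and normalise to a density.

    points: (N, 3) tensor of 3D points.
    Returns a (resolution, resolution, resolution) tensor of per-voxel densities.
    Note: the hard assignment via .long() is non-differentiable; it only
    illustrates the density computation itself.
    """
    # Map points to integer voxel indices and drop points outside the grid.
    idx = ((points - grid_min) / (grid_max - grid_min) * resolution).long()
    valid = ((idx >= 0) & (idx < resolution)).all(dim=1)
    idx = idx[valid]
    # Flatten 3D indices so we can accumulate counts with scatter_add_.
    flat = idx[:, 0] * resolution ** 2 + idx[:, 1] * resolution + idx[:, 2]
    counts = torch.zeros(resolution ** 3, device=points.device)
    counts.scatter_add_(0, flat, torch.ones_like(flat, dtype=counts.dtype))
    # Normalise by the number of input points so densities are comparable.
    return (counts / max(points.shape[0], 1)).view(resolution, resolution, resolution)


def vda_loss(ref_points, src_points_warped,
             grid_min=-10.0, grid_max=10.0, resolution=32):
    """L1 alignment between the voxel densities of two point clouds."""
    d_ref = voxel_density(ref_points, grid_min, grid_max, resolution)
    d_src = voxel_density(src_points_warped, grid_min, grid_max, resolution)
    return (d_ref - d_src).abs().mean()


if __name__ == "__main__":
    # Toy usage: random clouds standing in for the back-projected reference
    # cloud and the warped source cloud.
    ref = torch.randn(5000, 3) * 3.0
    src = torch.randn(5000, 3) * 3.0
    print(vda_loss(ref, src).item())
```

Comparing aggregated per-voxel densities, rather than individual point pairs, is what makes this a “region-to-region” alignment: a few outlier points from occlusions or moving objects perturb the density field far less than a rigid point-to-point registration.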
Pages: 495 - 513
Number of pages: 18
Related Papers
50 records in total
  • [21] Self-supervised monocular depth estimation for gastrointestinal endoscopy
    Liu, Yuying
    Zuo, Siyang
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2023, 238
  • [22] Self-supervised monocular depth estimation with direct methods
    Wang, Haixia
    Sun, Yehao
    Wu, Q. M. Jonathan
    Lu, Xiao
    Wang, Xiuling
    Zhang, Zhiguo
    NEUROCOMPUTING, 2021, 421 : 340 - 348
  • [24] Adaptive Self-supervised Depth Estimation in Monocular Videos
    Mendoza, Julio
    Pedrini, Helio
    IMAGE AND GRAPHICS (ICIG 2021), PT III, 2021, 12890 : 687 - 699
  • [25] Self-Supervised Monocular Depth Estimation With Extensive Pretraining
    Choi, Hyukdoo
    IEEE ACCESS, 2021, 9 : 157236 - 157246
  • [27] Self-supervised Depth Estimation from Spectral Consistency and Novel View Synthesis
    Lu, Yawen
    Lu, Guoyu
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [28] Enhanced blur-robust monocular depth estimation via self-supervised learning
    Sung, Chi-Hun
    Kim, Seong-Yeol
    Shin, Ho-Ju
    Lee, Se-Ho
    Kim, Seung-Wook
    ELECTRONICS LETTERS, 2024, 60 (22)
  • [29] An Efficient Self-Supervised Cross-View Training For Sentence Embedding
    Limkonchotiwat, Peerat
    Ponwitayarat, Wuttikorn
    Lowphansirikul, Lalita
    Udomcharoenchaikit, Can
    Chuangsuwanich, Ekapol
    Nutanong, Sarana
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2023, 11 : 1572 - 1587
  • [30] Learning Where to Learn in Cross-View Self-Supervised Learning
    Huang, Lang
    You, Shan
    Zheng, Mingkai
    Wang, Fei
    Qian, Chen
    Yamasaki, Toshihiko
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 14431 - 14440