Learning Depth from Monocular Videos using Direct Methods

Cited by: 397
Authors
Wang, Chaoyang [1]
Miguel Buenaposada, Jose [1, 2]
Zhu, Rui [1]
Lucey, Simon [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Univ Rey Juan Carlos, Mostoles, Spain
DOI
10.1109/CVPR.2018.00216
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The ability to predict depth from a single image, using recent advances in CNNs, is of increasing interest to the vision community. Unsupervised learning strategies are particularly appealing because they can exploit much larger and more varied monocular video datasets during training, without the need for ground-truth depth or stereo. In previous work, separate pose and depth CNN predictors had to be learned such that their joint outputs minimized the photometric error. Inspired by recent advances in direct visual odometry (DVO), we argue that the depth CNN predictor can be learned without a pose CNN predictor. Further, we demonstrate empirically that incorporating a differentiable implementation of DVO, along with a novel depth normalization strategy, substantially improves performance over state-of-the-art methods that use monocular videos for training.
Pages: 2022-2030
Page count: 9
Related papers
50 in total
  • [1] Unsupervised Learning of Monocular Depth from Videos
    Gao Haosheng
    Teng Wang
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019: 3945-3950
  • [2] Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras
    Gordon, Ariel
    Li, Hanhan
    Jonschkowski, Rico
    Angelova, Anelia
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 8976-8985
  • [3] Bags of tricks for learning depth and camera motion from monocular videos
    Dong B.
    Sheng L.
    Virtual Reality and Intelligent Hardware, 2019, 1 (05): 500-510
  • [4] Deep Learning-based Depth Estimation Methods from Monocular Image and Videos: A Comprehensive Survey
    Rajapaksha, Uchitha
    Sohel, Ferdous
    Laga, Hamid
    Diepeveen, Dean
    Bennamoun, Mohammed
    ACM COMPUTING SURVEYS, 2024, 56 (12)
  • [5] Online Depth Learning against Forgetting in Monocular Videos
    Zhang, Zhenyu
    Lathuiliere, Stephane
    Ricci, Elisa
    Sebe, Nicu
    Yan, Yan
    Yang, Jian
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020: 4493-4502
  • [6] Spatial Correspondence with Generative Adversarial Network: Learning Depth from Monocular Videos
    Wu, Zhenyao
    Wu, Xinyi
    Zhang, Xiaoping
    Wang, Song
    Ju, Lili
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 7493-7503
  • [7] Depth Prediction without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos
    Casser, Vincent
    Pirk, Soeren
    Mahjourian, Reza
    Angelova, Anelia
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019: 8001-8008
  • [8] Collaborative Learning of Depth Estimation, Visual Odometry and Camera Relocalization from Monocular Videos
    Zhao, Haimei
    Bian, Wei
    Yuan, Bo
    Tao, Dacheng
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020: 488-494
  • [9] Unsupervised Learning of Depth from Monocular Videos Using 3D-2D Corresponding Constraints
    Jin, Fusheng
    Zhao, Yu
    Wan, Chuanbing
    Yuan, Ye
    Wang, Shuliang
    REMOTE SENSING, 2021, 13 (09)
  • [10] Monocular Depth Estimation for Equirectangular Videos
    Fraser, Helmi
    Wang, Sen
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022: 5293-5299