Learning Depth from Monocular Videos using Direct Methods

Cited by: 397
Authors
Wang, Chaoyang [1]
Miguel Buenaposada, Jose [1, 2]
Zhu, Rui [1]
Lucey, Simon [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Univ Rey Juan Carlos, Mostoles, Spain
Source
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018
DOI
10.1109/CVPR.2018.00216
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The ability to predict depth from a single image, using recent advances in CNNs, is of increasing interest to the vision community. Unsupervised learning strategies are particularly appealing because they can exploit much larger and more varied monocular video datasets during training, without the need for ground-truth depth or stereo. In previous works, separate pose and depth CNN predictors had to be determined such that their joint outputs minimized the photometric error. Inspired by recent advances in direct visual odometry (DVO), we argue that the depth CNN predictor can be learned without a pose CNN predictor. Further, we demonstrate empirically that incorporating a differentiable implementation of DVO, along with a novel depth normalization strategy, substantially improves performance over state-of-the-art methods that use monocular videos for training.
Pages: 2022 - 2030
Page count: 9
Related papers
50 in total
  • [41] A new Evaluation Approach for Deep Learning-based Monocular Depth Estimation Methods
    Mauri, Antoine
    Khemmar, Redouane
    Boutteau, Remi
    Decoux, Benoit
    Ertaud, Jean-Yves
    Haddad, Madjid
    2020 IEEE 23RD INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2020
  • [42] Real-time segmentation methods for monocular soccer videos
    Hoernig M.
    Herrmann M.
    Radig B.
    Pattern Recognition and Image Analysis, 2015, 25 (02) : 327 - 337
  • [43] Unsupervised depth estimation from monocular videos with hybrid geometric-refined loss and contextual attention
    Zhang, Mingliang
    Ye, Xinchen
    Fan, Xin
    Zhong, Wei
    NEUROCOMPUTING, 2020, 379 (379) : 250 - 261
  • [44] Unique people count from monocular videos
    Satarupa Mukherjee
    Stephani Gil
    Nilanjan Ray
    The Visual Computer, 2015, 31 : 1405 - 1417
  • [45] Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos
    Bai, Ziqian
    Tan, Feitong
    Huang, Zeng
    Sarkar, Kripasindhu
    Tang, Danhang
    Qiu, Di
    Meka, Abhimitra
    Du, Ruofei
    Dou, Mingsong
    Orts-Escolano, Sergio
    Pandey, Rohit
    Tan, Ping
    Beeler, Thabo
    Fanello, Sean
    Zhang, Yinda
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023 : 16890 - 16900
  • [46] ROBUST LEARNING FOR DEEP MONOCULAR DEPTH ESTIMATION
    Irie, Go
    Kawanishi, Takahito
    Kashino, Kunio
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019 : 964 - 968
  • [47] Monocular Depth Estimation Based on Unsupervised Learning
    Liu, Wan
    Sun, Yan
    Wang, XuCheng
    Yang, Lin
    Zheng, Zhenrong
    OPTOELECTRONIC IMAGING AND MULTIMEDIA TECHNOLOGY VI, 2019, 11187
  • [48] Tracking human arm from monocular videos
    Yue, HongQiang
    Li, ChengRong
    Liang, YiXiong
    Luo, YangYu
    2007 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION, VOLS I-V, CONFERENCE PROCEEDINGS, 2007 : 2155 - 2159
  • [49] Deep learning for monocular depth estimation: A review
    Ming, Yue
    Meng, Xuyang
    Fan, Chunxiao
    Yu, Hui
    NEUROCOMPUTING, 2021, 438 : 14 - 33
  • [50] Unique people count from monocular videos
    Mukherjee, Satarupa
    Gil, Stephani
    Ray, Nilanjan
    VISUAL COMPUTER, 2015, 31 (10) : 1405 - 1417