Learning Depth from Monocular Videos using Direct Methods

Cited by: 397

Authors:

Wang, Chaoyang [1]

Miguel Buenaposada, Jose [1,2]

Zhu, Rui [1]

Lucey, Simon [1]

Affiliations:

[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

[2] Univ Rey Juan Carlos, Mostoles, Spain

Source:

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018

DOI:

10.1109/CVPR.2018.00216

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The ability to predict depth from a single image, using recent advances in CNNs, is of increasing interest to the vision community. Unsupervised learning strategies are particularly appealing because they can exploit much larger and more varied monocular video datasets during training, without the need for ground-truth depth or stereo. In previous work, separate pose and depth CNN predictors had to be determined such that their joint outputs minimized the photometric error. Inspired by recent advances in direct visual odometry (DVO), we argue that the depth CNN predictor can be learned without a pose CNN predictor. Further, we demonstrate empirically that incorporating a differentiable implementation of DVO, along with a novel depth normalization strategy, substantially improves performance over state-of-the-art methods that use monocular videos for training.

Pages: 2022-2030

Page count: 9

50 items in total

[41] A new Evaluation Approach for Deep Learning-based Monocular Depth Estimation Methods
Mauri, Antoine
Khemmar, Redouane
Boutteau, Remi
Decoux, Benoit
Ertaud, Jean-Yves
Haddad, Madjid
2020 IEEE 23RD INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2020
[42] Real-time segmentation methods for monocular soccer videos
Hoernig M.
Herrmann M.
Radig B.
Pattern Recognition and Image Analysis, 2015, 25 (02) : 327 - 337
[43] Unsupervised depth estimation from monocular videos with hybrid geometric-refined loss and contextual attention
Zhang, Mingliang
Ye, Xinchen
Fan, Xin
Zhong, Wei
NEUROCOMPUTING, 2020, 379 : 250 - 261
[44] Unique people count from monocular videos
Satarupa Mukherjee
Stephani Gil
Nilanjan Ray
The Visual Computer, 2015, 31 : 1405 - 1417
[45] Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos
Bai, Ziqian
Tan, Feitong
Huang, Zeng
Sarkar, Kripasindhu
Tang, Danhang
Qiu, Di
Meka, Abhimitra
Du, Ruofei
Dou, Mingsong
Orts-Escolano, Sergio
Pandey, Rohit
Tan, Ping
Beeler, Thabo
Fanello, Sean
Zhang, Yinda
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 16890 - 16900
[46] ROBUST LEARNING FOR DEEP MONOCULAR DEPTH ESTIMATION
Irie, Go
Kawanishi, Takahito
Kashino, Kunio
2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 964 - 968
[47] Monocular Depth Estimation Based on Unsupervised Learning
Liu, Wan
Sun, Yan
Wang, XuCheng
Yang, Lin
Zheng, Zhenrong
OPTOELECTRONIC IMAGING AND MULTIMEDIA TECHNOLOGY VI, 2019, 11187
[48] Tracking human arm from monocular videos
Yue, HongQiang
Li, ChengRong
Liang, YiXiong
Luo, YangYu
2007 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION, VOLS I-V, CONFERENCE PROCEEDINGS, 2007, : 2155 - 2159
[49] Deep learning for monocular depth estimation: A review
Ming, Yue
Meng, Xuyang
Fan, Chunxiao
Yu, Hui
NEUROCOMPUTING, 2021, 438 : 14 - 33
[50] Unique people count from monocular videos
Mukherjee, Satarupa
Gil, Stephani
Ray, Nilanjan
VISUAL COMPUTER, 2015, 31 (10) : 1405 - 1417
