Enforcing Temporal Consistency in Video Depth Estimation

Cited by: 5
Authors
Li, Siyuan [1 ]
Luo, Yue [1 ]
Zhu, Ye [1 ]
Zhao, Xun [1 ]
Li, Yu [1 ]
Shan, Ying [1 ]
Affiliations
[1] Tencent PCG, Appl Res Ctr, Beijing, Peoples R China
DOI
10.1109/ICCVW54120.2021.00134
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most existing monocular depth estimation methods are trained on single images and show unsatisfactory temporal stability in video prediction; they may rely on post-processing to address this issue. A few video-based depth estimation methods use reconstruction frameworks such as structure-from-motion or sequential modeling, but these methods make assumptions about the scenarios to which they can be applied, which limits their real-world use. In this work, we present a simple approach to improving temporal consistency in video depth estimation. Specifically, we learn a prior from video data, and this prior can be imposed directly on any single-image monocular depth method. During testing, our method simply performs end-to-end forward inference frame by frame, without any sequential or multi-frame module. In addition, we propose an evaluation metric that quantitatively measures the temporal consistency of video depth predictions. It does not require labelled depth ground truth and assesses only the flickering between consecutive frames. Experiments show that our method achieves improved temporal consistency on both standard benchmarks and general cases, without any post-processing or extra computational cost. A subjective study indicates that our proposed metric is consistent with users' visual perception, and that our results with higher consistency scores are indeed preferred. These features make our method a practical video depth estimator for predicting dense depth in real scenes, enabling several video-depth-based applications.
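The abstract describes the consistency metric only at a high level (ground-truth-free, measuring flicker between consecutive frames). The sketch below is a deliberately naive illustration of that idea, not the paper's actual metric: it averages the absolute change between consecutive depth maps, ignoring scene and camera motion, which real metrics typically compensate for (e.g. by warping frames with optical flow). The function name and interface are hypothetical.

```python
def naive_flicker_score(depth_frames):
    """Mean absolute change between consecutive depth maps.

    NOTE: an illustrative simplification, not the metric from the paper.
    depth_frames: list of 2-D depth maps (nested lists of floats),
    all of the same spatial size. Lower scores mean less flicker;
    a perfectly static prediction scores 0.0.
    """
    per_pair = []
    for prev, curr in zip(depth_frames, depth_frames[1:]):
        total, count = 0.0, 0
        for row_p, row_c in zip(prev, curr):
            for dp, dc in zip(row_p, row_c):
                total += abs(dp - dc)  # per-pixel depth change
                count += 1
        per_pair.append(total / count)  # mean change for this frame pair
    # average over all consecutive frame pairs
    return sum(per_pair) / len(per_pair)
```

For genuinely static scenes this reduces to a direct flicker measure; for moving scenes it conflates motion with flicker, which is exactly why motion compensation is needed in practice.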
Pages: 1145 - 1154
Number of pages: 10
Related Papers (50 total)
  • [1] Enforcing Temporal Consistency for Color Constancy in Video Sequences
    Buzzelli, Marco
    Rota, Claudio
    Bianco, Simone
    Schettini, Raimondo
    COMPUTATIONAL COLOR IMAGING, CCIW 2024, 2025, 15193 : 274 - 288
  • [2] Exploiting temporal consistency for real-time video depth estimation
    Zhang, Haokui
    Shen, Chunhua
    Li, Ying
    Cao, Yuanzhouhan
    Liu, Yu
    Yan, Youliang
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 1725 - 1734
  • [3] DISPARITY SEARCH RANGE ESTIMATION: ENFORCING TEMPORAL CONSISTENCY
    Min, Dongbo
    Yea, Sehoon
    Arican, Zafer
    Vetro, Anthony
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 2366 - 2369
  • [4] Spatio-temporal consistency in stereoscopic video depth map sequence estimation
    Duan, Fengfeng
    Wang, Yongbin
    Yang, Lifang
    Guan, Anqi
    Journal of Information and Computational Science, 2014, 11 (18): 6497 - 6508
  • [5] Temporal Consistency Enhancement of Depth Video Sequence
    Lin, Shih-Hung
    Chung, Pau-Choo
    2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, ELECTRONICS AND ELECTRICAL ENGINEERING (ISEEE), VOLS 1-3, 2014, : 1896 - +
  • [6] Enforcing temporal consistency in real-time stereo estimation
    Gong, Minglun
    COMPUTER VISION - ECCV 2006, PT 3, PROCEEDINGS, 2006, 3953 : 564 - 577
  • [7] Spatio-Temporal Consistency in Depth Video Enhancement
    Li, Li
    Zhang, Caiming
    JOURNAL OF ADVANCED MECHANICAL DESIGN SYSTEMS AND MANUFACTURING, 2013, 7 (05): 808 - 817
  • [8] Improved Depth Estimation Algorithm for Preserving Depth Edge and Temporal Consistency
    Yuan, Hui
    Chang, Yilin
    Lu, Zhaoyang
    Liu, Xiaoxian
    ICIEA 2010: PROCEEDINGS OF THE 5TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, VOL 3, 2010, : 519 - 522
  • [9] CONTENT-ADAPTIVE TEMPORAL CONSISTENCY ENHANCEMENT FOR DEPTH VIDEO
    Zeng, Huanqiang
    Ma, Kai-Kuang
    2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012), 2012, : 3017 - 3020
  • [10] SPATIO-TEMPORAL CONSISTENCY IN VIDEO DISPARITY ESTIMATION
    Khoshabeh, Ramsin
    Chan, Stanley H.
    Nguyen, Truong Q.
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 885 - 888