Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction

Cited by: 173
Authors
Le Guen, Vincent [1, 2]
Thome, Nicolas [2 ]
Affiliations
[1] EDF R&D, Chatou, France
[2] Conservatoire Natl Arts & Metiers, CEDRIC, Paris, France
Keywords
DOI
10.1109/CVPR42600.2020.01149
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Leveraging physical knowledge described by partial differential equations (PDEs) is an appealing way to improve unsupervised video prediction methods. Since physics is too restrictive for describing the full visual content of generic videos, we introduce PhyDNet, a two-branch deep architecture that explicitly disentangles PDE dynamics from unknown complementary information. A second contribution is a new recurrent physical cell (PhyCell), inspired by data assimilation techniques, for performing PDE-constrained prediction in latent space. Extensive experiments conducted on four datasets show the ability of PhyDNet to outperform state-of-the-art methods. Ablation studies also highlight the important gain brought by both disentanglement and PDE-constrained prediction. Finally, we show that PhyDNet presents interesting features for dealing with missing data and long-term forecasting.
Pages: 11471-11481
Page count: 11
Related papers
50 in total
  • [1] Disentangling Stochastic PDE Dynamics for Unsupervised Video Prediction
    Wu, Xinheng
    Lu, Jie
    Yan, Zheng
    Zhang, Guangquan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 35 (11) : 1 - 15
  • [2] Unsupervised Learning for Physical Interaction through Video Prediction
    Finn, Chelsea
    Goodfellow, Ian
    Levine, Sergey
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [3] Disentangling Propagation and Generation for Video Prediction
    Gao, Hang
    Xu, Huazhe
    Cai, Qi-Zhi
    Wang, Ruth
    Yu, Fisher
    Darrell, Trevor
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 9005 - 9014
  • [4] Learning to Disentangle Latent Physical Factors for Video Prediction
    Zhu, Deyao
    Munderloh, Marco
    Rosenhahn, Bodo
    Stueckle, Joerg
    PATTERN RECOGNITION, DAGM GCPR 2019, 2019, 11824 : 595 - 608
  • [5] Performance Prediction for Unsupervised Video Indexing
    Ewerth, Ralph
    Freisleben, Bernd
    COMPUTER ANALYSIS OF IMAGES AND PATTERNS, PROCEEDINGS, 2009, 5702 : 1036 - 1043
  • [6] Unsupervised Hierarchical Disentanglement for Video Prediction
    Motallebi, Mohammad Reza
    Westfechtel, Thomas
    Li, Yang
    Harada, Tatsuya
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 470 - 476
  • [7] Unsupervised framework for depth estimation and camera motion prediction from video
    Yang, Delong
    Zhong, Xunyu
    Gu, Dongbing
    Peng, Xiafu
    Hu, Huosheng
    NEUROCOMPUTING, 2020, 385 (385) : 169 - 185
  • [8] Unsupervised Learning of Lagrangian Dynamics from Images for Prediction and Control
    Zhong, Yaofeng Desmond
    Leonard, Naomi Ehrich
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [9] VarNet: Exploring Variations for Unsupervised Video Prediction
    Jin, Beibei
    Hu, Yu
    Zeng, Yiming
    Tang, Qiankun
    Liu, Shice
    Ye, Jing
    2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018, : 5801 - 5806
  • [10] Disentangling from babylonian confusion - Unsupervised language identification
    Biemann, C
    Teresniak, S
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2005, 3406 : 773 - 784