Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction

Cited by: 173
Authors
Le Guen, Vincent [1, 2]
Thome, Nicolas [2 ]
Affiliations
[1] EDF R&D, Chatou, France
[2] Conservatoire Natl Arts & Metiers, CEDRIC, Paris, France
DOI
10.1109/CVPR42600.2020.01149
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Leveraging physical knowledge described by partial differential equations (PDEs) is an appealing way to improve unsupervised video prediction methods. Since physics is too restrictive for describing the full visual content of generic videos, we introduce PhyDNet, a two-branch deep architecture that explicitly disentangles PDE dynamics from unknown complementary information. A second contribution is a new recurrent physical cell (PhyCell), inspired by data assimilation techniques, for performing PDE-constrained prediction in latent space. Extensive experiments conducted on four diverse datasets show the ability of PhyDNet to outperform state-of-the-art methods. Ablation studies also highlight the substantial gain brought by both disentanglement and PDE-constrained prediction. Finally, we show that PhyDNet presents interesting features for dealing with missing data and long-term forecasting.
Pages: 11471-11481
Page count: 11
Related papers
50 in total
  • [21] Goal Detection from Unsupervised Video Surveillance
    Patel, Chirag I.
    Patel, Ripal
    Patel, Palak
    ADVANCES IN COMPUTING AND INFORMATION TECHNOLOGY, 2011, 198 : 76 - +
  • [22] Unsupervised Learning of Event Classes from Video
    Sridhar, Muralikrishna
    Cohn, Anthony G.
    Hogg, David C.
    PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10), 2010, : 1631 - 1638
  • [23] Direct-from-Video: Unsupervised NRSfM
    Lebeda, Karel
    Hadfield, Simon
    Bowden, Richard
    COMPUTER VISION - ECCV 2016 WORKSHOPS, PT III, 2016, 9915 : 578 - 594
  • [24] Unsupervised Learning of Disentangled Representations from Video
    Denton, Emily
    Birodkar, Vighnesh
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [25] Unsupervised Video Prediction Network with Spatio-temporal Deep Features
    Jin, Beibei
    Zhou, Rong
    Zhang, Zhisheng
    Dai, Min
    PROCEEDINGS OF THE 2018 25TH INTERNATIONAL CONFERENCE ON MECHATRONICS AND MACHINE VISION IN PRACTICE (M2VIP), 2018, : 19 - 24
  • [26] Unsupervised Keypoint Learning for Guiding Class-Conditional Video Prediction
    Kim, Yunji
    Nam, Seonghyeon
    Cho, In
    Kim, Seon Joo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [27] Disentangling collective trends from local dynamics
    Barthelemy, Marc
    Nadal, Jean-Pierre
    Berestycki, Henri
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2010, 107 (17) : 7629 - 7634
  • [28] Unsupervised Transfer Learning For Video Prediction Based on Generative Adversarial Network
    Shi, Jiwen
    Zhu, Qiuguo
    Wu, Jun
    2021 27TH INTERNATIONAL CONFERENCE ON MECHATRONICS AND MACHINE VISION IN PRACTICE (M2VIP), 2021,
  • [29] Disentangling Style Factors from Speaker Representations
    Williams, Jennifer
    King, Simon
    INTERSPEECH 2019, 2019, : 3945 - 3949
  • [30] Harmony: A Generic Unsupervised Approach for Disentangling Semantic Content from Parameterized Transformations
    Uddin, Mostofa Rafid
    Howe, Gregory
    Zeng, Xiangrui
    Xu, Min
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 20614 - 20623