Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction

Cited by: 173
Authors
Le Guen, Vincent [1, 2]
Thome, Nicolas [2]
Affiliations
[1] EDF R&D, Chatou, France
[2] Conservatoire Natl Arts & Metiers, CEDRIC, Paris, France
DOI
10.1109/CVPR42600.2020.01149
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Leveraging physical knowledge described by partial differential equations (PDEs) is an appealing way to improve unsupervised video prediction methods. Since physics is too restrictive for describing the full visual content of generic videos, we introduce PhyDNet, a two-branch deep architecture that explicitly disentangles PDE dynamics from unknown complementary information. A second contribution is a new recurrent physical cell (PhyCell), inspired by data assimilation techniques, for performing PDE-constrained prediction in latent space. Extensive experiments conducted on four different datasets show that PhyDNet outperforms state-of-the-art methods. Ablation studies also highlight the significant gains brought by both disentanglement and PDE-constrained prediction. Finally, we show that PhyDNet has useful properties for handling missing data and long-term forecasting.
Pages: 11471-11481
Page count: 11
Related papers
50 in total
  • [41] Learning Semantic-Aware Dynamics for Video Prediction
    Bei, Xinzhu
    Yang, Yanchao
    Soatto, Stefano
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 902 - 912
  • [42] Unsupervised Learning of Visual Representations via Rotation and Future Frame Prediction for Video Retrieval
    Kumar, Vidit
    Tripathi, Vikas
    Pant, Bhaskar
    ADVANCES IN COMPUTING AND DATA SCIENCES, PT I, 2021, 1440 : 701 - 710
  • [43] Prediction of unknown terms of a sequence and its application to some physical problems
    Roy, Dhiranjan
    Bhattacharya, Ranjan
    ANNALS OF PHYSICS, 2006, 321 (06) : 1483 - 1523
  • [44] Unsupervised Task Graph Generation from Instructional Video Transcripts
    Logeswaran, Lajanugen
    Sohn, Sungryull
    Jang, Yunseok
    Lee, Moontae
    Lee, Honglak
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 3392 - 3406
  • [45] Unsupervised learning of multiple aspects of moving objects from video
    Titsias, MK
    Williams, CKI
    ADVANCES IN INFORMATICS, PROCEEDINGS, 2005, 3746 : 746 - 756
  • [46] UNSUPERVISED INCREMENTAL LEARNING OF DEEP DESCRIPTORS FROM VIDEO STREAMS
    Pernici, Federico
    Del Bimbo, Alberto
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2017,
  • [47] Unsupervised Scale-Consistent Depth Learning from Video
    Jia-Wang Bian
    Huangying Zhan
    Naiyan Wang
    Zhichao Li
    Le Zhang
    Chunhua Shen
    Ming-Ming Cheng
    Ian Reid
    International Journal of Computer Vision, 2021, 129 : 2548 - 2564
  • [48] Unsupervised Learning of Depth and Ego-Motion from Video
    Zhou, Tinghui
    Brown, Matthew
    Snavely, Noah
    Lowe, David G.
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6612 - +
  • [49] Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras
    Gordon, Ariel
    Li, Hanhan
    Jonschkowski, Rico
    Angelova, Anelia
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8976 - 8985
  • [50] Unsupervised Learning of Event AND-OR Grammar and Semantics from Video
    Si, Zhangzhang
    Pei, Mingtao
    Yao, Benjamin
    Zhu, Song-Chun
    2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2011, : 41 - 48