OBJECT-CENTRIC VIDEO PREDICTION VIA DECOUPLING OF OBJECT DYNAMICS AND INTERACTIONS

Cited by: 1
Authors
Villar-Corrales, Angel [1]
Wahdan, Ismail [1]
Behnke, Sven [1]
Affiliations
[1] Univ Bonn, Autonomous Intelligent Syst, Bonn, Germany
Keywords
Object-centric video prediction; scene parsing; object-centric learning; future frame prediction; transformers;
DOI
10.1109/ICIP49359.2023.10222810
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a framework for object-centric video prediction, i.e., parsing a video sequence into objects and modeling their dynamics and interactions in order to predict the future object states from which video frames are rendered. To facilitate the learning of meaningful spatio-temporal object representations and the forecasting of their states, we propose two novel object-centric video prediction (OCVP) transformer modules, which decouple the processing of temporal dynamics and object interactions. We show that OCVP predictors outperform object-agnostic video prediction models on two different datasets. Furthermore, we observe that OCVP modules learn consistent and interpretable object representations. Animations and code to reproduce our results can be found on our project website.
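As an illustration of what decoupling temporal dynamics from object interactions can look like, the following is a minimal NumPy sketch, not the authors' implementation. It assumes object states are stored as an array of shape (T, N, D) for T time steps, N object slots, and D features. The function names, the sequential ordering of the two attention steps, the identity query/key/value projections, and the absence of a causal mask are all simplifying assumptions made here for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    tokens: (..., n, d). Attention runs over the n axis; leading axes are
    treated as independent batches. Identity projections are an assumption
    made to keep the sketch short.
    """
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ tokens


def temporal_attention(slots):
    """Each object attends only to its own states across time.

    slots: (T, N, D) -> (T, N, D)
    """
    per_object = np.swapaxes(slots, 0, 1)      # (N, T, D)
    attended = self_attention(per_object)      # attention over T
    return np.swapaxes(attended, 0, 1)         # back to (T, N, D)


def relational_attention(slots):
    """Objects within the same time step attend to each other.

    slots: (T, N, D) -> (T, N, D)
    """
    return self_attention(slots)               # attention over N


def decoupled_block(slots):
    """Hypothetical sequential block: temporal step, then relational step,
    each with a residual connection."""
    slots = slots + temporal_attention(slots)
    return slots + relational_attention(slots)


rng = np.random.default_rng(0)
slots = rng.normal(size=(4, 3, 8))             # T=4, N=3, D=8
print(decoupled_block(slots).shape)
```

In this sketch the temporal step never mixes information between different objects, and the relational step never mixes information between different time steps. A joint attention over all T x N tokens would do both at once, which is the coupling that a decoupled design avoids.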
Pages: 570-574
Number of pages: 5