Segmenting Moving Objects via an Object-Centric Layered Representation

Authors
Xie, Junyu [1]
Xie, Weidi [1,2]
Zisserman, Andrew [1]
Affiliations
[1] Univ Oxford, Dept Engn Sci, Visual Geometry Grp, Oxford, England
[2] Shanghai Jiao Tong Univ, Coop Medianet Innovat Ctr, Shanghai, Peoples R China
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
SEGMENTATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
This paper aims to build a model that can discover, track and segment multiple moving objects in a video. We make four contributions. First, we introduce an object-centric segmentation model with a depth-ordered layer representation. It is implemented using a variant of the transformer architecture that ingests optical flow, where each query vector specifies an object and its layer for the entire video. The model can effectively discover multiple moving objects and handle mutual occlusions. Second, we introduce a scalable pipeline for generating multi-object synthetic training data via layer compositions. This data is used to train the proposed model, significantly reducing the need for labour-intensive annotations and supporting Sim2Real generalisation. Third, we conduct thorough ablation studies, showing that the model learns object permanence and temporal shape consistency and can predict amodal segmentation masks. Fourth, we evaluate our model, trained only on synthetic data, on standard video segmentation benchmarks (DAVIS, MoCA, SegTrack and FBMS-59) and achieve state-of-the-art performance among existing methods that do not rely on any manual annotations. Test-time adaptation yields further performance gains.
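As a rough illustration of what a depth-ordered layer representation implies, the sketch below turns full-extent (amodal) masks into visible (modal) masks by letting nearer layers occlude farther ones. It is not the authors' implementation: the function name `compose_layers`, the front-to-back ordering convention, and the toy masks are all assumptions made for this example.

```python
from typing import List

Mask = List[List[int]]  # binary mask, 1 = object pixel


def compose_layers(amodal_masks: List[Mask]) -> List[Mask]:
    """Turn depth-ordered amodal masks into visible (modal) masks.

    Assumption for this sketch: amodal_masks[0] is the front-most layer,
    and a pixel belongs to the first layer (in that order) that covers it.
    """
    if not amodal_masks:
        return []
    height, width = len(amodal_masks[0]), len(amodal_masks[0][0])
    # Tracks pixels already claimed by a nearer layer.
    occupied = [[0] * width for _ in range(height)]
    modal_masks = []
    for mask in amodal_masks:
        visible = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                if mask[y][x] and not occupied[y][x]:
                    visible[y][x] = 1
                    occupied[y][x] = 1
        modal_masks.append(visible)
    return modal_masks


# Toy example: two overlapping objects on a 2x3 grid.
front = [[1, 1, 0],
         [1, 1, 0]]
back = [[0, 1, 1],
        [0, 1, 1]]
modal_front, modal_back = compose_layers([front, back])
```

In this toy case the front object stays fully visible, while the back object loses the middle column where the two overlap. The same occlusion rule is one plausible way to read "layer compositions" when rendering synthetic multi-object training frames, though the abstract does not give the pipeline's details.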
Pages: 14