Decoupling Features in Hierarchical Propagation for Video Object Segmentation

Authors
Yang, Zongxin [1, 2]
Yang, Yi [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, CCAI, Hangzhou, Peoples R China
[2] Baidu Res, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper focuses on developing a more effective method of hierarchical propagation for semi-supervised Video Object Segmentation (VOS). Based on vision transformers, the recently developed Associating Objects with Transformers (AOT) approach introduces hierarchical propagation into VOS and has shown promising results. Hierarchical propagation gradually propagates information from past frames to the current frame and transfers the current-frame feature from object-agnostic to object-specific. However, the increase of object-specific information inevitably leads to the loss of object-agnostic visual information in deep propagation layers. To solve this problem and further facilitate the learning of visual embeddings, this paper proposes a Decoupling Features in Hierarchical Propagation (DeAOT) approach. First, DeAOT decouples the hierarchical propagation of object-agnostic and object-specific embeddings by handling them in two independent branches. Second, to compensate for the additional computation from dual-branch propagation, we propose an efficient module for constructing hierarchical propagation, i.e., the Gated Propagation Module, which is carefully designed with single-head attention. Extensive experiments show that DeAOT significantly outperforms AOT in both accuracy and efficiency. On YouTube-VOS, DeAOT achieves 86.0% at 22.4 fps and 82.0% at 53.4 fps. Without test-time augmentations, we achieve new state-of-the-art performance on four benchmarks, i.e., YouTube-VOS (86.2%), DAVIS 2017 (86.2%), DAVIS 2016 (92.9%), and VOT 2020 (0.622). Project page: https://github.com/z-x-yang/AOT.
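The dual-branch idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, for illustration only, that the object-agnostic (visual) branch computes single-head attention from the current frame to a memory of past frames, that the object-specific (ID) branch reuses those attention weights, and that a scalar sigmoid gate derived from the current-frame feature scales both results. The function names, the gate form, and the toy inputs are all hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_weights(queries, keys):
    """Single-head scaled dot-product attention weights, one row per query."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        for q in queries
    ]


def weighted_sum(weights, values):
    """Aggregate memory values with the given attention weights."""
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]


def gated_propagate(cur_visual, mem_visual, mem_id):
    """Propagate visual and ID embeddings in two separate branches.

    Assumption (hypothetical): the ID branch reuses the visual branch's
    attention weights, and one scalar gate per location scales both outputs.
    """
    weights = attention_weights(cur_visual, mem_visual)
    visual_msg = weighted_sum(weights, mem_visual)  # object-agnostic branch
    id_msg = weighted_sum(weights, mem_id)          # object-specific branch
    gates = [sigmoid(sum(f) / len(f)) for f in cur_visual]
    new_visual = [[g * x for x in row] for g, row in zip(gates, visual_msg)]
    new_id = [[g * x for x in row] for g, row in zip(gates, id_msg)]
    return weights, new_visual, new_id


# Toy data: 2 current-frame locations, 3 memory locations.
cur_visual = [[1.0, 0.0], [0.0, 1.0]]
mem_visual = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mem_id = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

weights, new_visual, new_id = gated_propagate(cur_visual, mem_visual, mem_id)
```

Keeping the two embeddings in separate lists means the ID information never overwrites the visual features, which is the decoupling the abstract describes; the shared attention weights are one assumed way to keep the second branch cheap.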
Pages: 13