Decoupling Features in Hierarchical Propagation for Video Object Segmentation

Cited by: 0
Authors
Yang, Zongxin [1 ,2 ]
Yang, Yi [1 ]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci & Technol, CCAI, Hangzhou, Peoples R China
[2] Baidu Res, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper focuses on developing a more effective method of hierarchical propagation for semi-supervised Video Object Segmentation (VOS). Built on vision transformers, the recently developed Associating Objects with Transformers (AOT) approach introduced hierarchical propagation into VOS and has shown promising results. Hierarchical propagation gradually propagates information from past frames to the current frame and transfers the current-frame feature from object-agnostic to object-specific. However, the increase of object-specific information inevitably leads to the loss of object-agnostic visual information in deep propagation layers. To solve this problem and further facilitate the learning of visual embeddings, this paper proposes a Decoupling Features in Hierarchical Propagation (DeAOT) approach. First, DeAOT decouples the hierarchical propagation of object-agnostic and object-specific embeddings by handling them in two independent branches. Second, to compensate for the additional computation from dual-branch propagation, we propose an efficient module for constructing hierarchical propagation, the Gated Propagation Module, which is carefully designed with single-head attention. Extensive experiments show that DeAOT significantly outperforms AOT in both accuracy and efficiency. On YouTube-VOS, DeAOT achieves 86.0% at 22.4 fps and 82.0% at 53.4 fps. Without test-time augmentations, we achieve new state-of-the-art performance on four benchmarks: YouTube-VOS (86.2%), DAVIS 2017 (86.2%), DAVIS 2016 (92.9%), and VOT 2020 (0.622). Project page: https://github.com/z-x-yang/AOT.
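The dual-branch idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the parameter dictionary, the sigmoid gate, the residual connections, and the choice to let the object-specific (ID) branch reuse the visual branch's attention map are all assumptions made for illustration. What it shows is one propagation layer in which the object-agnostic (visual) branch is never written to by the object-specific branch, with each branch using a single attention head whose output is modulated by a gate.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_single_head_attention(query, key, value, gate_weight, out_weight):
    # Hypothetical gated propagation step: one attention head, then an
    # element-wise sigmoid gate computed from the query, then a projection.
    scale = 1.0 / np.sqrt(query.shape[-1])
    attn = softmax((query @ key.T) * scale, axis=-1)       # (Nq, Nk)
    propagated = attn @ value                              # (Nq, Dv)
    gate = 1.0 / (1.0 + np.exp(-(query @ gate_weight)))    # (Nq, Dv)
    return (gate * propagated) @ out_weight                # (Nq, Dv)

def dual_branch_layer(cur_visual, cur_id, mem_visual, mem_id, params):
    # Visual (object-agnostic) branch: depends on visual embeddings only.
    visual_out = cur_visual + gated_single_head_attention(
        cur_visual, mem_visual, mem_visual,
        params["gate_v"], params["out_v"])
    # ID (object-specific) branch: assumed here to match on visual
    # embeddings but to propagate the memory's ID embeddings.
    id_out = cur_id + gated_single_head_attention(
        cur_visual, mem_visual, mem_id,
        params["gate_i"], params["out_i"])
    return visual_out, id_out

# Toy usage: 5 current-frame tokens, 7 memory tokens,
# visual dimension 8, ID dimension 4 (all sizes are arbitrary).
rng = np.random.default_rng(0)
cur_visual = rng.normal(size=(5, 8))
mem_visual = rng.normal(size=(7, 8))
cur_id = rng.normal(size=(5, 4))
mem_id = rng.normal(size=(7, 4))
params = {
    "gate_v": rng.normal(size=(8, 8)), "out_v": rng.normal(size=(8, 8)),
    "gate_i": rng.normal(size=(8, 4)), "out_i": rng.normal(size=(4, 4)),
}
visual_out, id_out = dual_branch_layer(
    cur_visual, cur_id, mem_visual, mem_id, params)
print(visual_out.shape, id_out.shape)
```

In this sketch, changing `cur_id` or `mem_id` leaves `visual_out` unchanged, which is the decoupling property the abstract describes: object-specific information accumulates in its own branch instead of overwriting object-agnostic visual features.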
Pages: 13