Holistic Prototype Attention Network for Few-Shot Video Object Segmentation

Cited by: 8
Authors
Tang, Yin [1 ]
Chen, Tao [1 ]
Jiang, Xiruo [1 ]
Yao, Yazhou [1 ]
Xie, Guo-Sen [1 ]
Shen, Heng-Tao [2 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Prototypes; Task analysis; Object segmentation; Semantic segmentation; Semantics; Feature extraction; Annotations; Few-shot video object segmentation; video object segmentation; few-shot semantic segmentation;
DOI
10.1109/TCSVT.2023.3296629
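Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract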
Few-shot video object segmentation (FSVOS) aims to segment dynamic objects of unseen classes by resorting to a small set of support images that contain pixel-level object annotations. Existing methods have demonstrated that the domain agent-based attention mechanism is effective in FSVOS by learning the correlation between support images and query frames. However, the agent frame contains redundant pixel information and background noise, resulting in inferior segmentation performance. Moreover, existing methods tend to ignore inter-frame correlations in query videos. To alleviate the above dilemma, we propose a holistic prototype attention network (HPAN) for advancing FSVOS. Specifically, HPAN introduces a prototype graph attention module (PGAM) and a bidirectional prototype attention module (BPAM), transferring informative knowledge from seen to unseen classes. PGAM generates local prototypes from all foreground features and then utilizes their internal correlations to enhance the representation of the holistic prototypes. BPAM exploits the holistic information from support images and video frames by fusing co-attention and self-attention to achieve support-query semantic consistency and inner-frame temporal consistency. Extensive experiments on YouTube-FSVOS have been provided to demonstrate the effectiveness and superiority of our proposed HPAN method. Our source code and models are available anonymously at https://github.com/NUST-Machine-Intelligence-Laboratory/HPAN.
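The abstract describes building local prototypes from foreground support features and letting query features attend to them. The sketch below illustrates that general idea only and is not the authors' PGAM or BPAM. It assumes features are plain lists of floats, that prototypes come from averaging contiguous groups of foreground features (a stand-in for whatever grouping HPAN actually uses), and that attention is ordinary scaled dot-product attention. All function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def local_prototypes(features, mask, num_prototypes):
    """Average groups of foreground features into at most num_prototypes prototypes.

    Assumption: contiguous chunking stands in for a real grouping step.
    """
    foreground = [f for f, m in zip(features, mask) if m == 1]
    if not foreground:
        raise ValueError("mask selects no foreground features")
    groups = min(num_prototypes, len(foreground))
    size = math.ceil(len(foreground) / groups)
    return [mean_vector(foreground[i:i + size])
            for i in range(0, len(foreground), size)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, prototypes):
    """Scaled dot-product attention of one query feature over the prototypes."""
    scale = math.sqrt(len(query))
    scores = [sum(q * p for q, p in zip(query, proto)) / scale
              for proto in prototypes]
    weights = softmax(scores)
    return [sum(w * proto[d] for w, proto in zip(weights, prototypes))
            for d in range(len(query))]


# Toy support "image": four foreground pixels and one background pixel.
support_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [5.0, 5.0]]
support_mask = [1, 1, 1, 1, 0]

prototypes = local_prototypes(support_features, support_mask, num_prototypes=2)
attended = attend([2.0, 0.0], prototypes)
print(prototypes, attended)
```

In this toy case the background pixel is excluded by the mask, and the query feature is pulled toward the prototype it most resembles.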
Pages: 6699-6709
Number of pages: 11
Related papers
50 in total
  • [21] CLIP-Driven Prototype Network for Few-Shot Semantic Segmentation
    Guo, Shi-Cheng
    Liu, Shang-Kun
    Wang, Jing-Yu
    Zheng, Wei-Min
    Jiang, Cheng-Yu
    ENTROPY, 2023, 25 (09)
  • [22] Selective Prototype Network for Few-Shot Metal Surface Defect Segmentation
    Yu, Ruiyun
    Guo, Bingyang
    Yang, Kang
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71
  • [23] Self-Calibrated Cross Attention Network for Few-Shot Segmentation
    Xu, Qianxiong
    Zhao, Wenting
    Lin, Guosheng
    Long, Cheng
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 655 - 665
  • [24] Pyramid Co-Attention Compare Network for Few-Shot Segmentation
    Zhang, Defu
    Luo, Ronghua
    Chen, Xuebin
    Chen, Lingwei
    IEEE ACCESS, 2021, 9 : 137249 - 137259
  • [25] Fast target-aware learning for few-shot video object segmentation
    Yadang CHEN
    Chuanyan HAO
    Zhi-Xin YANG
    Enhua WU
    Science China (Information Sciences), 2022, 65 (08) : 71 - 86
  • [26] Fast target-aware learning for few-shot video object segmentation
    Chen, Yadang
    Hao, Chuanyan
    Yang, Zhi-Xin
    Wu, Enhua
    SCIENCE CHINA-INFORMATION SCIENCES, 2022, 65 (08)
  • [27] Fast target-aware learning for few-shot video object segmentation
    Yadang Chen
    Chuanyan Hao
    Zhi-Xin Yang
    Enhua Wu
    Science China Information Sciences, 2022, 65
  • [28] σ-Adaptive Decoupled Prototype for Few-Shot Object Detection
    Du, Jinhao
    Zhang, Shan
    Chen, Qiang
    Le, Haifeng
    Sun, Yanpeng
    Ni, Yao
    Wang, Jian
    He, Bin
    Wang, Jingdong
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 18904 - 18914
  • [29] Dual-Guided Frequency Prototype Network for Few-Shot Semantic Segmentation
    Wen, Chunlin
    Huang, Hui
    Ma, Yan
    Yuan, Feiniu
    Zhu, Hongqing
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 8874 - 8888
  • [30] Found missing semantics: Supplemental prototype network for few-shot semantic segmentation
    Liang, Chen
    Bai, Shuang
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249