Holistic Prototype Attention Network for Few-Shot Video Object Segmentation

Cited by: 8
Authors
Tang, Yin [1]
Chen, Tao [1]
Jiang, Xiruo [1]
Yao, Yazhou [1]
Xie, Guo-Sen [1]
Shen, Heng-Tao [2]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Prototypes; Task analysis; Object segmentation; Semantic segmentation; Semantics; Feature extraction; Annotations; Few-shot video object segmentation; video object segmentation; few-shot semantic segmentation;
DOI
10.1109/TCSVT.2023.3296629
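Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract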
Few-shot video object segmentation (FSVOS) aims to segment dynamic objects of unseen classes by resorting to a small set of support images that contain pixel-level object annotations. Existing methods have demonstrated that the domain agent-based attention mechanism is effective in FSVOS by learning the correlation between support images and query frames. However, the agent frame contains redundant pixel information and background noise, resulting in inferior segmentation performance. Moreover, existing methods tend to ignore inter-frame correlations in query videos. To alleviate the above dilemma, we propose a holistic prototype attention network (HPAN) for advancing FSVOS. Specifically, HPAN introduces a prototype graph attention module (PGAM) and a bidirectional prototype attention module (BPAM), transferring informative knowledge from seen to unseen classes. PGAM generates local prototypes from all foreground features and then utilizes their internal correlations to enhance the representation of the holistic prototypes. BPAM exploits the holistic information from support images and video frames by fusing co-attention and self-attention to achieve support-query semantic consistency and inner-frame temporal consistency. Extensive experiments on YouTube-FSVOS have been provided to demonstrate the effectiveness and superiority of our proposed HPAN method. Our source code and models are available anonymously at https://github.com/NUST-Machine-Intelligence-Laboratory/HPAN.
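The prototype-based idea summarized in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how local prototypes are formed or how the attention is parameterized, so the sketch assumes plain k-means over foreground features, a parameter-free scaled dot-product attention among prototypes, and a residual read-out from query pixels to prototypes. The function names `local_prototypes`, `refine_prototypes`, and `attend_query` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def local_prototypes(features, mask, k=4, iters=10, seed=0):
    """Cluster foreground feature vectors into k local prototypes.

    features: (C, H, W) support feature map; mask: (H, W) binary foreground mask.
    Assumption: k-means stands in for whatever grouping the paper actually uses.
    Returns an array of shape (k, C).
    """
    c = features.shape[0]
    fg = features.reshape(c, -1).T[mask.reshape(-1) > 0]  # (N, C) foreground pixels
    if fg.shape[0] < k:
        raise ValueError("need at least k foreground pixels")
    rng = np.random.default_rng(seed)
    protos = fg[rng.choice(fg.shape[0], size=k, replace=False)]  # copy of k pixels
    for _ in range(iters):
        dist = ((fg[:, None, :] - protos[None, :, :]) ** 2).sum(-1)  # (N, k)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = fg[assign == j]
            if len(members) > 0:  # keep the old centre if a cluster empties
                protos[j] = members.mean(axis=0)
    return protos


def refine_prototypes(protos):
    """Let prototypes attend to each other (fully connected graph), with a residual."""
    d = protos.shape[1]
    attn = softmax(protos @ protos.T / np.sqrt(d), axis=-1)  # (k, k), rows sum to 1
    return protos + attn @ protos, attn


def attend_query(query_features, protos):
    """Each query pixel reads from the prototypes; output keeps the (C, H, W) shape."""
    c, h, w = query_features.shape
    q = query_features.reshape(c, -1).T  # (H*W, C)
    attn = softmax(q @ protos.T / np.sqrt(c), axis=-1)  # (H*W, k)
    return (q + attn @ protos).T.reshape(c, h, w)
```

In this sketch the support image contributes only a handful of prototype vectors rather than every pixel, which is one plausible way to read the abstract's concern about redundant pixel information and background noise in an agent frame.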
Pages: 6699-6709
Number of pages: 11
Related papers
50 in total
  • [1] Few-shot video object segmentation with prototype evolution
    Mao, Binjie
    Liu, Xiyan
    Shi, Linsu
    Yu, Jiazhong
    Li, Fei
    Xiang, Shiming
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (10): 5367 - 5382
  • [2] Few-shot video object segmentation with prototype evolution
    Binjie Mao
    Xiyan Liu
    Linsu Shi
    Jiazhong Yu
    Fei Li
    Shiming Xiang
    Neural Computing and Applications, 2024, 36 : 5367 - 5382
  • [3] Holistic Prototype Activation for Few-Shot Segmentation
    Cheng, Gong
    Lang, Chunbo
    Han, Junwei
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (04) : 4650 - 4666
  • [4] Intermediate prototype network for few-shot segmentation
    Luo, Xiaoliu
    Duan, Zhao
    Zhang, Taiping
    SIGNAL PROCESSING, 2023, 203
  • [5] Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation
    Liu, Nian
    Nan, Kepan
    Zhao, Wangbo
    Liu, Yuanwei
    Yao, Xiwen
    Khan, Salman
    Cholakkal, Hisham
    Anwer, Rao Muhammad
    Han, Junwei
    Khan, Fahad Shahbaz
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 18816 - 18825
  • [6] Delving Deep into Many-to-many Attention for Few-shot Video Object Segmentation
    Chen, Haoxin
    Wu, Hanjie
    Zhao, Nanxuan
    Ren, Sucheng
    He, Shengfeng
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 14035 - 14044
  • [7] DUAL-ATTENTION NETWORK FOR FEW-SHOT SEGMENTATION
    Chen, Zhikui
    Wang, Han
    Zhang, Suhua
    Zhong, Fangming
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2210 - 2214
  • [8] CobNet: Cross Attention on Object and Background for Few-Shot Segmentation
    Guan, Haoyan
    Spratling, Michael
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 39 - 45
  • [9] Exploring the Better Correlation for Few-Shot Video Object Segmentation
    Luo, Naisong
    Wang, Yuan
    Sun, Rui
    Xiong, Guoxin
    Zhang, Tianzhu
    Wu, Feng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (03) : 2133 - 2146
  • [10] Temporal Transductive Inference for Few-Shot Video Object Segmentation
    Siam, Mennatullah
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025