Exploring the Better Correlation for Few-Shot Video Object Segmentation

Times Cited: 0
Authors
Luo, Naisong [1 ]
Wang, Yuan [1 ]
Sun, Rui [1 ]
Xiong, Guoxin [1 ]
Zhang, Tianzhu [1 ,2 ]
Wu, Feng [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci, Hefei 230027, Peoples R China
[2] Deep Space Explorat Lab, Hefei 230088, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Few-shot video object segmentation; video object segmentation; few-shot learning;
DOI
10.1109/TCSVT.2024.3491214
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic and Communication Technology];
Discipline Classification Codes
0808; 0809;
Abstract
Few-shot video object segmentation (FSVOS) aims to achieve accurate segmentation of novel objects in given video sequences, where the target objects are specified by a limited number of annotated support images. Most previous top-performing methods adopt either the support-query semantic correlation learning paradigm or the intra-query temporal correlation learning paradigm. Nevertheless, they either fail to model temporal consistency across frames, resulting in temporally inconsistent segmentation, or lose the diverse object information in the support set, leading to incomplete segmentation. We therefore argue that it is more desirable to model both correlations collaboratively. In this work, we delve into the issues that arise when few-shot image segmentation methods are combined with video object segmentation methods and propose a dedicated Collaborative Correlation Network (CoCoNet) to address them, consisting of a pixel correlation calibration module and a temporal correlation mining module. The proposed CoCoNet enjoys several merits. First, the pixel correlation calibration module mitigates the noise in the support-query correlation by integrating an affinity learning strategy with a prototype learning strategy. Specifically, we employ Optimal Transport to enrich the pixel correlation with contextual information, thereby reducing intra-class differences between support and query. Second, the temporal correlation mining module alleviates the uncertainty of the initial frame and establishes reliable guidance for the subsequent frames of the query video. With the collaboration of these two modules, our CoCoNet can effectively establish support-query and temporal correlations simultaneously and achieve accurate FSVOS. Extensive experimental results on two challenging benchmarks demonstrate that our method performs favorably against state-of-the-art FSVOS methods.
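For illustration only, the following is a minimal, hypothetical sketch of the idea the abstract describes: computing a dense support-query pixel correlation and smoothing it with entropy-regularized Optimal Transport (Sinkhorn iterations). It is not the authors' released code; the function names, feature shapes, uniform marginals, and the regularization value eps are all assumptions, and the actual pixel correlation calibration and temporal correlation mining modules of CoCoNet are not reproduced here.

import math
import torch
import torch.nn.functional as F

def sinkhorn(cost, n_iters=50, eps=0.05):
    # Entropy-regularized OT with uniform marginals, computed in the log domain
    # for numerical stability. cost: (Nq, Ns) pairwise cost between query and
    # support pixels; returns a transport plan of the same shape.
    Nq, Ns = cost.shape
    log_K = -cost / eps                                            # log of the Gibbs kernel
    log_r = torch.full((Nq,), -math.log(Nq), device=cost.device)   # uniform row marginal
    log_c = torch.full((Ns,), -math.log(Ns), device=cost.device)   # uniform column marginal
    log_u = torch.zeros(Nq, device=cost.device)
    log_v = torch.zeros(Ns, device=cost.device)
    for _ in range(n_iters):
        log_u = log_r - torch.logsumexp(log_K + log_v[None, :], dim=1)
        log_v = log_c - torch.logsumexp(log_K + log_u[:, None], dim=0)
    return torch.exp(log_u[:, None] + log_K + log_v[None, :])

def calibrated_pixel_correlation(query_feat, support_feat):
    # query_feat: (C, Hq, Wq), support_feat: (C, Hs, Ws) backbone feature maps.
    # Returns the raw cosine affinity and an OT-calibrated correlation,
    # both of shape (Hq*Wq, Hs*Ws).
    C = query_feat.shape[0]
    q = F.normalize(query_feat.reshape(C, -1).t(), dim=1)    # (Nq, C) unit-norm query pixels
    s = F.normalize(support_feat.reshape(C, -1).t(), dim=1)  # (Ns, C) unit-norm support pixels
    affinity = q @ s.t()                                      # cosine similarity in [-1, 1]
    plan = sinkhorn(1.0 - affinity)                           # OT plan from a cosine cost
    return affinity, plan

# Toy usage with random features standing in for a backbone output.
query = torch.randn(256, 30, 30)
support = torch.randn(256, 30, 30)
affinity, plan = calibrated_pixel_correlation(query, support)
print(affinity.shape, plan.shape)  # torch.Size([900, 900]) torch.Size([900, 900])

Because the OT plan is constrained to match marginal distributions over all pixels, it spreads mass according to global context rather than independent per-pixel matches, which is one plausible way to reduce the intra-class support-query discrepancy mentioned above.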
Pages: 2133-2146
Number of Pages: 14
Related Papers
50 records in total (selection shown below)
  • [1] Few-shot video object segmentation with prototype evolution
    Mao, Binjie
    Liu, Xiyan
    Shi, Linsu
    Yu, Jiazhong
    Li, Fei
    Xiang, Shiming
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (10) : 5367 - 5382
  • [2] Temporal Transductive Inference for Few-Shot Video Object Segmentation
    Siam, Mennatullah
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025
  • [3] Few-Shot Video Object Detection
    Fan, Qi
    Tang, Chi-Keung
    Tai, Yu-Wing
    COMPUTER VISION, ECCV 2022, PT XX, 2022, 13680 : 76 - 98
  • [4] Holistic Prototype Attention Network for Few-Shot Video Object Segmentation
    Tang, Yin
    Chen, Tao
    Jiang, Xiruo
    Yao, Yazhou
    Xie, Guo-Sen
    Shen, Heng-Tao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (08) : 6699 - 6709
  • [5] Fast target-aware learning for few-shot video object segmentation
    Chen, Yadang
    Hao, Chuanyan
    Yang, Zhi-Xin
    Wu, Enhua
    SCIENCE CHINA-INFORMATION SCIENCES, 2022, 65 (08)
  • [6] Exploring Hierarchical Prototypes for Few-Shot Segmentation
    Chen, Yaozong
    Cao, Wenming
    ARTIFICIAL INTELLIGENCE, CICAI 2022, PT I, 2022, 13604 : 42 - 53
  • [7] Delving Deep into Many-to-many Attention for Few-shot Video Object Segmentation
    Chen, Haoxin
    Wu, Hanjie
    Zhao, Nanxuan
    Ren, Sucheng
    He, Shengfeng
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021 : 14035 - 14044