Global Spectral Filter Memory Network for Video Object Segmentation

Cited by: 20
Authors
Liu, Yong [1 ,2 ]
Yu, Ran [1 ]
Wang, Jiahao [1 ]
Zhao, Xinyuan [3 ]
Wang, Yitong [2 ]
Tang, Yansong [1 ]
Yang, Yujiu [1 ]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Beijing, Peoples R China
[2] ByteDance Inc, Beijing, Peoples R China
[3] Northwestern Univ, Evanston, IL USA
Source
Funding
National Natural Science Foundation of China;
Keywords
Video object segmentation; Spectral domain;
DOI
10.1007/978-3-031-19818-2_37
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies semi-supervised video object segmentation through boosting intra-frame interaction. Recent memory network-based methods focus on exploiting inter-frame temporal reference while paying little attention to intra-frame spatial dependency. Specifically, these segmentation models tend to be susceptible to interference from unrelated non-target objects within a frame. To this end, we propose the Global Spectral Filter Memory network (GSFM), which improves intra-frame interaction by learning long-term spatial dependencies in the spectral domain. The key component of GSFM is the 2D (inverse) discrete Fourier transform for spatial information mixing. In addition, we empirically find that low-frequency features should be enhanced in the encoder (backbone), while high-frequency features should be enhanced in the decoder (segmentation head). We attribute this to the encoder's role in extracting semantic information and the decoder's role in highlighting fine-grained details. Accordingly, a Low (High) Frequency Module is proposed for the encoder (decoder). Extensive experiments on the popular DAVIS and YouTube-VOS benchmarks demonstrate that GSFM noticeably outperforms the baseline method and achieves state-of-the-art performance. Moreover, extensive analysis shows that the proposed modules are reasonable and generalize well.
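The spatial mixing described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `(H, W, C)` layout, the `sigma` parameter, and the fixed Gaussian mask (standing in for whatever filter the paper learns) are all assumptions made for illustration. The sketch applies a 2D DFT over the spatial axes, reweights the frequency bins, and applies the inverse 2D DFT, so every output location depends on every input location.

```python
import numpy as np


def radial_frequency(h, w):
    """Radial frequency of each 2D DFT bin (0 at the DC bin)."""
    fy = np.fft.fftfreq(h)[:, None]  # cycles per sample along height
    fx = np.fft.fftfreq(w)[None, :]  # cycles per sample along width
    return np.sqrt(fy ** 2 + fx ** 2)


def spectral_filter(feat, mode="low", sigma=0.15):
    """Filter a (H, W, C) feature map in the spectral domain.

    Assumption: a fixed Gaussian mask replaces the learned filter.
    mode="low" keeps low frequencies; any other value keeps the complement.
    """
    h, w, _ = feat.shape
    low = np.exp(-(radial_frequency(h, w) ** 2) / (2 * sigma ** 2))
    mask = low if mode == "low" else 1.0 - low
    spec = np.fft.fft2(feat, axes=(0, 1))          # 2D DFT per channel
    mixed = np.fft.ifft2(spec * mask[:, :, None], axes=(0, 1))
    return mixed.real  # imaginary part is numerical noise for real input
```

Under these assumptions, `spectral_filter(x, "low")` would play the role of an encoder-side low-frequency emphasis and `spectral_filter(x, "high")` a decoder-side high-frequency emphasis; the two outputs sum back to the input because the masks are complementary.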
Pages: 648 - 665
Number of pages: 18
Related papers
50 in total
  • [31] Video object segmentation via couple streams and feature memory
    Liang, Yun
    Xiao, Xinjie
    Qiu, Shaojian
    Zhang, Yuqing
    Su, Zhuo
    IET IMAGE PROCESSING, 2024, 18 (09) : 2257 - 2272
  • [32] Local Memory Read-and-Comparator for Video Object Segmentation
    Heo, Yuk
    Koh, Yeong Jun
    Kim, Chang-Su
    IEEE ACCESS, 2022, 10 : 90004 - 90016
  • [33] Attention-Guided Memory Model for Video Object Segmentation
    Lin, Yunjian
    Tan, Yihua
    Communications in Computer and Information Science, 2022, 1566 CCIS : 67 - 85
  • [34] ASDeM: Augmenting SAM With Decoupled Memory for Video Object Segmentation
    Liu, Xiaohu
    Luo, Yichuang
    Sun, Wei
    IEEE ACCESS, 2024, 12 : 73218 - 73227
  • [35] Video object plane segmentation using a morphological motion filter and Hausdorff object tracking
    Meier, T
    Ngan, KN
    1998 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING - PROCEEDINGS, VOL 2, 1998, : 652 - 656
  • [36] Temporally Consistent Referring Video Object Segmentation With Hybrid Memory
    Miao, Bo
    Bennamoun, Mohammed
    Gao, Yongsheng
    Shah, Mubarak
    Mian, Ajmal
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (11) : 11373 - 11385
  • [37] Global Memory and Local Continuity for Video Object Detection
    Han, Liang
    Yin, Zhaozheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 3681 - 3693
  • [38] Robust global motion estimation oriented to video object segmentation
    Qi, Bin
    Ghazal, Mohammed
    Amer, Aishy
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2008, 17 (06) : 958 - 967
  • [39] Video object segmentation based on global motion estimation/compensation
    Hsu, CT
    Tsai, YS
    Hsieh, MH
    Chien, YN
    ICCE: 2001 INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, DIGEST OF TECHNICAL PAPERS, 2001, : 168 - 169
  • [40] Video Object Segmentation Using Global and Instance Embedding Learning
    Ge, Wenbin
    Lu, Xiankai
    Shen, Jianbing
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 16831 - 16840