Global Spectral Filter Memory Network for Video Object Segmentation

Cited by: 20

Authors:

Liu, Yong ^{[1,2]}

Yu, Ran ^{[1]}

Wang, Jiahao ^{[1]}

Zhao, Xinyuan ^{[3]}

Wang, Yitong ^{[2]}

Tang, Yansong ^{[1]}

Yang, Yujiu ^{[1]}

Affiliations:

[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Beijing, Peoples R China

[2] ByteDance Inc, Beijing, Peoples R China

[3] Northwestern Univ, Evanston, IL USA

Source:

COMPUTER VISION, ECCV 2022, PT XXIX | 2022 / Vol. 13689

Funding:

National Natural Science Foundation of China;

Keywords:

Video object segmentation; Spectral domain;

DOI:

10.1007/978-3-031-19818-2_37

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper studies semi-supervised video object segmentation by boosting intra-frame interaction. Recent memory network-based methods focus on exploiting inter-frame temporal reference while paying little attention to intra-frame spatial dependency. Specifically, these segmentation models tend to be susceptible to interference from unrelated non-target objects within a frame. To this end, we propose the Global Spectral Filter Memory network (GSFM), which improves intra-frame interaction by learning long-term spatial dependencies in the spectral domain. The key component of GSFM is the 2D (inverse) discrete Fourier transform, used for spatial information mixing. In addition, we empirically find that low-frequency features should be enhanced in the encoder (backbone), while high-frequency features should be enhanced in the decoder (segmentation head). We attribute this to the encoder's role of extracting semantic information and the decoder's role of highlighting fine-grained details. Accordingly, Low and High Frequency Modules are proposed for the encoder and decoder, respectively. Extensive experiments on the popular DAVIS and YouTube-VOS benchmarks demonstrate that GSFM noticeably outperforms the baseline method and achieves state-of-the-art performance. Further analysis shows that the proposed modules are reasonable and generalize well.
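The spatial mixing described in the abstract can be illustrated with a minimal NumPy sketch: transform a feature map with the 2D DFT, scale each frequency bin, and transform back. This is not the authors' implementation. The function names, the radial `cutoff`, and the fixed `boost` factor are hypothetical choices made for illustration; the abstract does not say how the Low and High Frequency Modules are parameterized, and in a trained network the per-frequency gains would presumably be learned rather than fixed.

```python
import numpy as np


def radial_frequency(h, w):
    """Radial frequency (cycles per sample) of each bin of an h x w 2D DFT."""
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies in [-0.5, 0.5)
    return np.sqrt(fy ** 2 + fx ** 2)


def spectral_filter(feat, gain):
    """Global spatial mixing: 2D DFT -> per-frequency gain -> inverse 2D DFT.

    feat: real array of shape (C, H, W).
    gain: real array of shape (H, W); it should be symmetric under frequency
          negation so that the filtered output stays real.
    """
    spec = np.fft.fft2(feat, axes=(-2, -1))
    return np.fft.ifft2(spec * gain, axes=(-2, -1)).real


def low_frequency_gain(h, w, cutoff=0.1, boost=2.0):
    """Hypothetical encoder-side gain: amplify bins at or below the cutoff."""
    return np.where(radial_frequency(h, w) <= cutoff, boost, 1.0)


def high_frequency_gain(h, w, cutoff=0.1, boost=2.0):
    """Hypothetical decoder-side gain: amplify bins above the cutoff."""
    return np.where(radial_frequency(h, w) > cutoff, boost, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((4, 16, 16))
    enc = spectral_filter(feat, low_frequency_gain(16, 16))
    dec = spectral_filter(feat, high_frequency_gain(16, 16))
    print(enc.shape, dec.shape)
```

Because every output position depends on every frequency bin, and each bin depends on every input position, a single filter of this kind mixes information across the whole frame, which is the sense in which the dependency is global.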

Pages: 648 - 665

Number of pages: 18

50 items in total

[31] Video object segmentation via couple streams and feature memory
Liang, Yun
Xiao, Xinjie
Qiu, Shaojian
Zhang, Yuqing
Su, Zhuo
IET IMAGE PROCESSING, 2024, 18 (09) : 2257 - 2272
[32] Local Memory Read-and-Comparator for Video Object Segmentation
Heo, Yuk
Koh, Yeong Jun
Kim, Chang-Su
IEEE ACCESS, 2022, 10 : 90004 - 90016
[33] Attention-Guided Memory Model for Video Object Segmentation
Lin, Yunjian
Tan, Yihua
Communications in Computer and Information Science, 2022, 1566 CCIS : 67 - 85
[34] ASDeM: Augmenting SAM With Decoupled Memory for Video Object Segmentation
Liu, Xiaohu
Luo, Yichuang
Sun, Wei
IEEE ACCESS, 2024, 12 : 73218 - 73227
[35] Video object plane segmentation using a morphological motion filter and Hausdorff object tracking
Meier, T
Ngan, KN
1998 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING - PROCEEDINGS, VOL 2, 1998, : 652 - 656
[36] Temporally Consistent Referring Video Object Segmentation With Hybrid Memory
Miao, Bo
Bennamoun, Mohammed
Gao, Yongsheng
Shah, Mubarak
Mian, Ajmal
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (11) : 11373 - 11385
[37] Global Memory and Local Continuity for Video Object Detection
Han, Liang
Yin, Zhaozheng
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 3681 - 3693
[38] Robust global motion estimation oriented to video object segmentation
Qi, Bin
Ghazal, Mohammed
Amer, Aishy
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2008, 17 (06) : 958 - 967
[39] Video object segmentation based on global motion estimation/compensation
Hsu, CT
Tsai, YS
Hsieh, MH
Chien, YN
ICCE: 2001 INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, DIGEST OF TECHNICAL PAPERS, 2001, : 168 - 169
[40] Video Object Segmentation Using Global and Instance Embedding Learning
Ge, Wenbin
Lu, Xiankai
Shen, Jianbing
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 16831 - 16840