XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

Cited by: 118

Authors:

Cheng, Ho Kei [1]

Schwing, Alexander G. [1]

Affiliations:

[1] Univ Illinois, Champaign, IL 61820 USA

Source:

COMPUTER VISION - ECCV 2022, PT XXVIII | 2022 / Vol. 13688

Keywords:

DOI:

10.1007/978-3-031-19815-1_37

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically only uses one type of feature memory. For videos longer than a minute, a single feature memory model tightly links memory consumption and accuracy. In contrast, following the Atkinson-Shiffrin model, we develop an architecture that incorporates multiple independent yet deeply-connected feature memory stores: a rapidly updated sensory memory, a high-resolution working memory, and a compact thus sustained long-term memory. Crucially, we develop a memory potentiation algorithm that routinely consolidates actively used working memory elements into the long-term memory, which avoids memory explosion and minimizes performance decay for long-term prediction. Combined with a new memory reading mechanism, XMem greatly exceeds state-of-the-art performance on long-video datasets while being on par with state-of-the-art methods (that do not work on long videos) on short-video datasets.


Pages: 640-658

Number of pages: 19

