XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

Cited by: 118
Authors
Cheng, Ho Kei [1]
Schwing, Alexander G. [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
Source
COMPUTER VISION - ECCV 2022, PT XXVIII | 2022 / Vol. 13688
DOI
10.1007/978-3-031-19815-1_37
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically only uses one type of feature memory. For videos longer than a minute, a single feature memory model tightly links memory consumption and accuracy. In contrast, following the Atkinson-Shiffrin model, we develop an architecture that incorporates multiple independent yet deeply-connected feature memory stores: a rapidly updated sensory memory, a high-resolution working memory, and a compact thus sustained long-term memory. Crucially, we develop a memory potentiation algorithm that routinely consolidates actively used working memory elements into the long-term memory, which avoids memory explosion and minimizes performance decay for long-term prediction. Combined with a new memory reading mechanism, XMem greatly exceeds state-of-the-art performance on long-video datasets while being on par with state-of-the-art methods (that do not work on long videos) on short-video datasets.
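The consolidation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyMemory`, its fields, and the parameters `working_capacity` and `keep_top_k` are hypothetical names chosen for illustration, usage is approximated by a simple read counter, and the sensory memory is omitted entirely. It only shows the general pattern of promoting frequently used working-memory entries into a compact long-term store when the working memory fills up.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Toy two-tier store: bounded working memory plus a compact long-term memory.

    Illustrative assumption only; not the XMem architecture or its API.
    """
    working_capacity: int = 4   # hypothetical limit on working-memory entries
    keep_top_k: int = 2         # hypothetical number of entries promoted each time
    working: dict = field(default_factory=dict)    # key -> feature
    usage: dict = field(default_factory=dict)      # key -> read count
    long_term: dict = field(default_factory=dict)  # key -> feature

    def write(self, key, feature):
        """Add an entry to working memory, consolidating first if it is full."""
        if len(self.working) >= self.working_capacity:
            self.consolidate()
        self.working[key] = feature
        self.usage[key] = 0

    def read(self, key):
        """Return a feature; count the access if it lives in working memory."""
        if key in self.working:
            self.usage[key] += 1
            return self.working[key]
        return self.long_term.get(key)

    def consolidate(self):
        """Promote the most-read working entries to long-term memory, drop the rest."""
        ranked = sorted(self.working, key=lambda k: self.usage[k], reverse=True)
        for key in ranked[: self.keep_top_k]:
            self.long_term[key] = self.working[key]
        self.working.clear()
        self.usage.clear()
```

In this sketch the working memory never exceeds its fixed capacity, and the long-term store grows by at most `keep_top_k` entries per consolidation, which mirrors in a simplified way how decoupling the two stores keeps memory growth bounded on long inputs.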
Pages: 640 - 658
Page count: 19