Efficient Regional Memory Network for Video Object Segmentation

Cited by: 103
Authors
Xie, Haozhe [1, 2]
Yao, Hongxun [1]
Zhou, Shangchen [3]
Zhang, Shengping [1, 4]
Sun, Wenxiu [2, 5]
Affiliations
[1] Harbin Inst Technol, Harbin, Peoples R China
[2] SenseTime Res & Tetras AI, Hong Kong, Peoples R China
[3] Nanyang Technol Univ, S Lab, Singapore, Singapore
[4] Peng Cheng Lab, Shenzhen, Peoples R China
[5] Shanghai AI Lab, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR46437.2021.00134
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, several Space-Time Memory based networks have shown that object cues (e.g., video frames as well as the segmented object masks) from past frames are useful for segmenting objects in the current frame. However, these methods exploit the information in the memory by global-to-global matching between the current and past frames, which leads to mismatches with similar objects and high computational complexity. To address these problems, we propose a novel local-to-local matching solution for semi-supervised VOS, namely the Regional Memory Network (RMNet). In RMNet, a precise regional memory is constructed by memorizing the local regions where the target objects appear in the past frames. For the current query frame, the query regions are tracked and predicted based on the optical flow estimated from the previous frame. The proposed local-to-local matching effectively alleviates the ambiguity of similar objects in both memory and query frames, which allows information to be passed from the regional memory to the query region efficiently and effectively. Experimental results indicate that the proposed RMNet performs favorably against state-of-the-art methods on the DAVIS and YouTube-VOS datasets.
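The contrast between global-to-global and local-to-local matching can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `memory_read`, the dot-product affinity with softmax weighting, and the boolean region masks are all assumptions made for illustration, and the optical-flow tracking that produces the query region is not modeled (the masks are simply given).

```python
import math


def memory_read(q_keys, m_keys, m_vals, q_region=None, m_region=None):
    """Attention-style memory read (hypothetical helper, illustration only).

    q_keys: one key vector per query position.
    m_keys, m_vals: one key and one value vector per memory position.
    q_region, m_region: optional boolean masks selecting the positions that
        take part in matching. With both left as None every query position is
        matched against every memory position (global-to-global).
    """
    q_idx = [i for i in range(len(q_keys)) if q_region is None or q_region[i]]
    m_idx = [j for j in range(len(m_keys)) if m_region is None or m_region[j]]
    dim = len(m_vals[0])
    # Query positions outside the region receive an all-zero read-out.
    out = [[0.0] * dim for _ in q_keys]
    if not m_idx:
        return out
    for i in q_idx:
        # Dot-product affinity between this query key and each allowed memory key.
        scores = [sum(a * b for a, b in zip(q_keys[i], m_keys[j])) for j in m_idx]
        top = max(scores)  # subtract the maximum for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        out[i] = [
            sum(e * m_vals[j][d] for e, j in zip(exps, m_idx)) / total
            for d in range(dim)
        ]
    return out
```

In this sketch, restricting both masks reduces the number of affinity computations from (all query positions) × (all memory positions) to (query-region positions) × (memory-region positions), and a memory position outside the region, such as a similar-looking distractor object, cannot contribute to the read-out.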
Pages: 1286-1295
Number of pages: 10