Recurrent Dynamic Embedding for Video Object Segmentation

Cited by: 37
Authors
Li, Mingxing [1, 3]
Hu, Li [2]
Xiong, Zhiwei [1]
Zhang, Bang [2]
Pan, Pan [2]
Liu, Dong [1]
Affiliations
[1] University of Science and Technology of China, Hefei, People's Republic of China
[2] Alibaba Group, Alibaba DAMO Academy, Hangzhou, People's Republic of China
[3] Alibaba, Hangzhou, People's Republic of China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
DOI
10.1109/CVPR52688.2022.00139
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Space-time memory (STM) based video object segmentation (VOS) networks usually keep enlarging the memory bank every few frames, which yields excellent performance. However, 1) the hardware cannot withstand the ever-increasing memory requirements as the video length increases, and 2) storing a large amount of information inevitably introduces noise, which is not conducive to reading the most important information from the memory bank. In this paper, we propose a Recurrent Dynamic Embedding (RDE) to build a memory bank of constant size. Specifically, we explicitly generate and update the RDE with the proposed Spatio-temporal Aggregation Module (SAM), which exploits the cue of historical information. To avoid error accumulation owing to the recurrent usage of SAM, we propose an unbiased guidance loss during the training stage, which makes SAM more robust in long videos. Moreover, the predicted masks in the memory bank are inaccurate due to imperfect network inference, which affects the segmentation of the query frame. To address this problem, we design a novel self-correction strategy so that the network can repair the embeddings of masks of varying quality in the memory bank. Extensive experiments show that our method achieves the best tradeoff between performance and speed.
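The memory argument in the abstract can be illustrated with a toy sketch. It contrasts a bank that appends every stored frame with a fixed-size embedding that is updated recurrently. This is only an illustration under stated assumptions: the function names, the flat 16-value embeddings, and the gated average in `update_fixed_memory` are hypothetical stand-ins, not the paper's SAM or its actual memory layout.

```python
import random


def grow_memory(bank, frame_embedding):
    """STM-style baseline: append the frame, so the bank grows with video length."""
    return bank + [frame_embedding]


def update_fixed_memory(embedding, frame_embedding, gate=0.5):
    """Recurrent update of a fixed-size embedding.

    The gated average is an assumption for illustration only; it is NOT the
    paper's Spatio-temporal Aggregation Module.
    """
    return [(1.0 - gate) * m + gate * f for m, f in zip(embedding, frame_embedding)]


random.seed(0)
dim = 16          # hypothetical embedding size per frame
num_frames = 20   # hypothetical number of stored frames
frames = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_frames)]

growing_bank = [frames[0]]
fixed_embedding = list(frames[0])
for frame in frames[1:]:
    growing_bank = grow_memory(growing_bank, frame)
    fixed_embedding = update_fixed_memory(fixed_embedding, frame)

growing_size = len(growing_bank) * dim  # scales with the number of stored frames
fixed_size = len(fixed_embedding)       # stays at dim regardless of video length
print(growing_size, fixed_size)
```

In this toy setting the appended bank holds `num_frames * dim` values while the recurrent embedding stays at `dim`, which is the constant-size property the abstract describes.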
Pages: 1322-1331
Page count: 10