Learning Video Object Segmentation with Visual Memory

Cited by: 205
Authors
Tokmakov, Pavel [1]
Alahari, Karteek [1]
Schmid, Cordelia [1]
Affiliations
[1] Univ Grenoble Alpes, CNRS, INRIA, Grenoble INP, LJK, F-38000 Grenoble, France
DOI
10.1109/ICCV.2017.480
CLC number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper addresses the task of segmenting moving objects in unconstrained videos. We introduce a novel two-stream neural network with an explicit memory module to achieve this. The two streams of the network encode spatial and temporal features in a video sequence respectively, while the memory module captures the evolution of objects over time. The module to build a "visual memory" in video, i.e., a joint representation of all the video frames, is realized with a convolutional recurrent unit learned from a small number of training video sequences. Given a video frame as input, our approach assigns each pixel an object or background label based on the learned spatio-temporal features as well as the "visual memory" specific to the video, acquired automatically without any manually-annotated frames. The visual memory is implemented with convolutional gated recurrent units, which allow spatial information to be propagated over time. We evaluate our method extensively on two benchmarks, the DAVIS and Freiburg-Berkeley motion segmentation datasets, and show state-of-the-art results. For example, our approach outperforms the top method on the DAVIS dataset by nearly 6%. We also provide an extensive ablative analysis to investigate the influence of each component in the proposed framework.
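The abstract's idea of a convolutional gated recurrent unit acting as a "visual memory" can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the names `conv2d_same`, `ConvGRUCell`, and `run_memory`, the 3x3 kernel, the random weights, and the omission of bias terms are all assumptions made for illustration, and the standard GRU gate equations are used with convolutions in place of matrix products. In the paper's setting, each frame's input would be the two-stream feature map; here it is just an arbitrary `(channels, H, W)` array.

```python
import numpy as np

def conv2d_same(x, w):
    """Cross-correlate x (C_in, H, W) with w (C_out, C_in, k, k), k odd,
    using zero padding so the output keeps the H x W spatial size."""
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Contribution of kernel tap (i, j) to every output location.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j],
                             xp[:, i:i + h, j:j + wd])
    return out

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class ConvGRUCell:
    """Hypothetical ConvGRU cell: the hidden state is a feature map, so
    spatial layout is preserved while information is carried over time."""

    def __init__(self, in_ch, hid_ch, k=3, seed=0):
        rng = np.random.default_rng(seed)
        shape = (hid_ch, in_ch + hid_ch, k, k)
        # Random weights stand in for learned ones (assumption).
        self.w_z = rng.normal(0.0, 0.1, shape)  # update gate
        self.w_r = rng.normal(0.0, 0.1, shape)  # reset gate
        self.w_h = rng.normal(0.0, 0.1, shape)  # candidate state
        self.hid_ch = hid_ch

    def step(self, x, h):
        xh = np.concatenate([x, h], axis=0)
        z = sigmoid(conv2d_same(xh, self.w_z))
        r = sigmoid(conv2d_same(xh, self.w_r))
        xrh = np.concatenate([x, r * h], axis=0)
        cand = np.tanh(conv2d_same(xrh, self.w_h))
        # Per-pixel blend of the previous memory and the new candidate.
        return (1.0 - z) * h + z * cand

def run_memory(frames, cell):
    """Fold a sequence of per-frame feature maps into one memory state."""
    h = np.zeros((cell.hid_ch,) + frames[0].shape[1:])
    for x in frames:
        h = cell.step(x, h)
    return h
```

A usage sketch: `run_memory([f1, f2, f3], ConvGRUCell(in_ch=2, hid_ch=4))` returns a `(4, H, W)` state for frames of shape `(2, H, W)`. Because every gate is computed by convolution, each location of the state depends only on a local neighbourhood of the input and the previous state, which is what lets spatial information propagate over time.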
Pages: 4491-4500
Page count: 10
Related papers
50 in total
  • [1] Learning Unsupervised Video Object Segmentation through Visual Attention
    Wang, Wenguan
    Song, Hongmei
    Zhao, Shuyang
    Shen, Jianbing
    Zhao, Sanyuan
    Hoi, Steven C. H.
    Ling, Haibin
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3059 - 3069
  • [2] Learning Quality-aware Dynamic Memory for Video Object Segmentation
    Liu, Yong
    Yu, Ran
    Yin, Fei
    Zhao, Xinyuan
    Zhao, Wei
    Xia, Weihao
    Yang, Yujiu
    COMPUTER VISION, ECCV 2022, PT XXIX, 2022, 13689 : 468 - 486
  • [3] Learning effective feature representation for video object segmentation via memory
    Li, Jun
    Sun, Lijuan
    Ren, Hengyi
    Cao, Ying
    Li, Suya
    Xie, Xin
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [4] Meta-Learning Deep Visual Words for Fast Video Object Segmentation
    Behl, Harkirat Singh
    Najafi, Mohammad
    Arnab, Anurag
    Torr, Philip H. S.
    2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 8484 - 8491
  • [5] Visual Attention Guided Video Object Segmentation
    Liang, Hao
    Tan, Yihua
    PROCEEDINGS OF THE 2019 14TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2019), 2019, : 345 - 349
  • [6] Learning Position and Target Consistency for Memory-based Video Object Segmentation
    Hu, Li
    Zhang, Peng
    Zhang, Bang
    Pan, Pan
    Xu, Yinghui
    Jin, Rong
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 4142 - 4152
  • [7] Adaptive Memory Management for Video Object Segmentation
    Pourganjalikhan, Ali
    Poullis, Charalambos
    2022 19TH CONFERENCE ON ROBOTS AND VISION (CRV 2022), 2022, : 75 - 82
  • [8] Modulated Memory Network for Video Object Segmentation
    Lu, Hannan
    Guo, Zixian
    Zuo, Wangmeng
    MATHEMATICS, 2024, 12 (06)
  • [9] Video object motion segmentation for intelligent visual surveillance
    Jiang, M.
    Crookes, D.
    IMVIP 2007: INTERNATIONAL MACHINE VISION AND IMAGE PROCESSING CONFERENCE, PROCEEDINGS, 2007, : 202 - 202
  • [10] Adaptive Online Learning for Video Object Segmentation
    Wei, Li
    Xu, Chunyan
    Zhang, Tong
    INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING: VISUAL DATA ENGINEERING, PT I, 2019, 11935 : 22 - 34