Learning Video Object Segmentation with Visual Memory

Cited by: 205
Authors
Tokmakov, Pavel [1]
Alahari, Karteek [1]
Schmid, Cordelia [1]
Affiliations
[1] Univ Grenoble Alpes, CNRS, INRIA, Grenoble INP, LJK, F-38000 Grenoble, France
DOI
10.1109/ICCV.2017.480
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the task of segmenting moving objects in unconstrained videos. We introduce a novel two-stream neural network with an explicit memory module to achieve this. The two streams of the network encode spatial and temporal features in a video sequence respectively, while the memory module captures the evolution of objects over time. The module that builds a "visual memory" of the video, i.e., a joint representation of all the video frames, is realized with a convolutional recurrent unit learned from a small number of training video sequences. Given a video frame as input, our approach assigns each pixel an object or background label based on the learned spatio-temporal features as well as the "visual memory" specific to the video, acquired automatically without any manually annotated frames. The visual memory is implemented with convolutional gated recurrent units, which allow spatial information to be propagated over time. We evaluate our method extensively on two benchmarks, the DAVIS and Freiburg-Berkeley motion segmentation datasets, and show state-of-the-art results. For example, our approach outperforms the top method on the DAVIS dataset by nearly 6%. We also provide an extensive ablative analysis to investigate the influence of each component in the proposed framework.
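As an illustration of the convolutional gated recurrent unit mentioned in the abstract, the following is a minimal sketch of one recurrent step in its commonly used form. It is not taken from the paper: it assumes a single feature channel, plain Python lists instead of tensors, and hypothetical names (`conv2d_same`, `conv_gru_step`, `run_memory`, and the parameter keys `Wxz`, `Whz`, `Wxr`, `Whr`, `Wxh`, `Whh`, `bz`, `br`, `bh`). The point it shows is that the gates are computed with convolutions, so the hidden state keeps its spatial layout from frame to frame.

```python
import math


def conv2d_same(x, k):
    """Zero-padded 'same' 2-D cross-correlation of a single-channel map x with kernel k."""
    rows, cols = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for a in range(kh):
                for b in range(kw):
                    ii, jj = i + a - ph, j + b - pw
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += x[ii][jj] * k[a][b]
            out[i][j] = total
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv_gru_step(x, h_prev, p):
    """One ConvGRU update (standard formulation, assumed here):
        z  = sigmoid(Wxz * x + Whz * h_prev + bz)      update gate
        r  = sigmoid(Wxr * x + Whr * h_prev + br)      reset gate
        c  = tanh(Wxh * x + Whh * (r . h_prev) + bh)   candidate state
        h  = (1 - z) . h_prev + z . c
    where * is convolution and . is element-wise multiplication.
    """
    rows, cols = len(x), len(x[0])
    grid = [(i, j) for i in range(rows) for j in range(cols)]

    xz, hz = conv2d_same(x, p["Wxz"]), conv2d_same(h_prev, p["Whz"])
    xr, hr = conv2d_same(x, p["Wxr"]), conv2d_same(h_prev, p["Whr"])
    z = [[0.0] * cols for _ in range(rows)]
    r = [[0.0] * cols for _ in range(rows)]
    for i, j in grid:
        z[i][j] = sigmoid(xz[i][j] + hz[i][j] + p["bz"])
        r[i][j] = sigmoid(xr[i][j] + hr[i][j] + p["br"])

    # The reset gate scales the previous state before it enters the candidate.
    gated = [[r[i][j] * h_prev[i][j] for j in range(cols)] for i in range(rows)]
    xh, hh = conv2d_same(x, p["Wxh"]), conv2d_same(gated, p["Whh"])

    h_new = [[0.0] * cols for _ in range(rows)]
    for i, j in grid:
        cand = math.tanh(xh[i][j] + hh[i][j] + p["bh"])
        h_new[i][j] = (1.0 - z[i][j]) * h_prev[i][j] + z[i][j] * cand
    return h_new


def run_memory(frames, p):
    """Feed per-frame feature maps through the unit, starting from a zero state."""
    rows, cols = len(frames[0]), len(frames[0][0])
    h = [[0.0] * cols for _ in range(rows)]
    states = []
    for x in frames:
        h = conv_gru_step(x, h, p)
        states.append(h)
    return states
```

In a full model the inputs `x` would be multi-channel feature maps produced by the two streams and the kernels would be learned; both are left out of this sketch.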
Pages: 4491 - 4500
Number of pages: 10
Related papers
50 in total
  • [21] Automatic Video Object Segmentation Based on Visual and Motion Saliency
    Peng, Qinmu
    Cheung, Yiu-Ming
    IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (12) : 3083 - 3094
  • [22] Learning Video Object Segmentation from Static Images
    Perazzi, Federico
    Khoreva, Anna
    Benenson, Rodrigo
    Schiele, Bernt
    Sorkine-Hornung, Alexander
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 3491 - 3500
  • [23] Application and Prospect of Deep Learning in Video Object Segmentation
    Chen J.
    Chen Y.-S.
    Li W.-H.
    Tian Y.
    Liu Z.
    He Y.
    Jisuanji Xuebao/Chinese Journal of Computers, 2021, 44 (03) : 609 - 631
  • [24] Video object segmentation and tracking using ψ-learning classification
    Liu, Y
    Zheng, YF
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2005, 15 (07) : 885 - 899
  • [25] Going Deeper into Embedding Learning for Video Object Segmentation
    Yang, Zongxin
    Li, Peike
    Feng, Qianyu
    Wei, Yunchao
    Yang, Yi
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 697 - 700
  • [26] Unsupervised Video Object Segmentation for Deep Reinforcement Learning
    Goel, Vik
    Weng, Jameson
    Poupart, Pascal
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [27] LIP: Learning Instance Propagation for Video Object Segmentation
    Lyu, Ye
    Vosselman, George
    Xia, Gui-Song
    Yang, Michael Ying
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 2739 - 2748
  • [28] Joint Inductive and Transductive Learning for Video Object Segmentation
    Mao, Yunyao
    Wang, Ning
    Zhou, Wengang
    Li, Houqiang
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9650 - 9659
  • [29] Deep Reinforcement Learning for Object Segmentation in Video Sequences
    Sahba, Farhang
    2016 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE & COMPUTATIONAL INTELLIGENCE (CSCI), 2016, : 857 - 860
  • [30] Memory Aggregation Networks for Efficient Interactive Video Object Segmentation
    Miao, Jiaxu
    Wei, Yunchao
    Yang, Yi
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, : 10363 - 10372