Deep Learning in Latent Space for Video Prediction and Compression

Cited by: 45
Authors
Liu, Bowen [1 ]
Chen, Yu [1 ]
Liu, Shiyu [1 ]
Kim, Hun-Seok [1 ]
Affiliation
[1] Univ Michigan, Ann Arbor, MI 48109 USA
Keywords
EVENT DETECTION;
DOI
10.1109/CVPR46437.2021.00076
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning-based video compression has achieved substantial progress in recent years. The most influential approaches adopt deep neural networks (DNNs) to remove spatial and temporal redundancies by finding appropriate lower-dimensional representations of the frames in a video. We propose a novel DNN-based framework that predicts and compresses video sequences in the latent vector space. The proposed method first learns an efficient lower-dimensional latent space representation of each video frame and then performs inter-frame prediction in that latent domain. The latent domain compression of individual frames is obtained by a deep autoencoder trained with a generative adversarial network (GAN). To exploit the temporal correlation within the video frame sequence, we employ a convolutional long short-term memory (ConvLSTM) network to predict the latent vector representation of the future frame. We demonstrate our method with two applications, video compression and abnormal event detection, which share the same latent frame prediction network. The proposed method exhibits superior or competitive performance compared to state-of-the-art algorithms specifically designed for either video compression or anomaly detection.
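The abstract describes predicting each frame's latent vector from earlier ones and reusing that predictor for both compression and anomaly detection. The sketch below illustrates that general idea only; it is not the authors' implementation. Several parts are assumptions made for illustration: the latent vectors are taken as given (the paper obtains them from a GAN-trained autoencoder), a linear extrapolation stands in for the ConvLSTM predictor, and the residual coding with uniform quantization and the mean-squared prediction error used as an anomaly score are choices of this sketch, not details stated in the abstract. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def predict_next_latent(history: Sequence[Vector]) -> Vector:
    """Placeholder predictor (assumption): linear extrapolation from the
    last two latents. The paper uses a ConvLSTM network for this step."""
    if len(history) == 1:
        return list(history[-1])
    prev, last = history[-2], history[-1]
    return [2.0 * b - a for a, b in zip(prev, last)]


def add_residual(prediction: Vector, symbols: Sequence[int], step: float) -> Vector:
    """Reconstruct a latent from its prediction and quantized residual."""
    return [p + s * step for p, s in zip(prediction, symbols)]


def encode_sequence(
    latents: Sequence[Vector], step: float = 0.1
) -> Tuple[List[List[int]], List[float], List[Vector]]:
    """Code each latent as a quantized residual against its prediction.

    Returns the integer residual symbols, a per-frame anomaly score
    (mean squared prediction error), and the latents the decoder will see.
    The first latent is assumed to be transmitted as-is.
    """
    decoded: List[Vector] = [list(latents[0])]
    symbols: List[List[int]] = []
    scores: List[float] = []
    for z in latents[1:]:
        # Predict from already-decoded latents so encoder and decoder agree.
        z_hat = predict_next_latent(decoded)
        residual = [a - b for a, b in zip(z, z_hat)]
        scores.append(sum(r * r for r in residual) / len(residual))
        q = [round(r / step) for r in residual]
        symbols.append(q)
        decoded.append(add_residual(z_hat, q, step))
    return symbols, scores, decoded


def decode_sequence(
    first: Vector, symbols: Sequence[Sequence[int]], step: float = 0.1
) -> List[Vector]:
    """Rebuild the latent sequence from the first latent and the symbols."""
    decoded: List[Vector] = [list(first)]
    for q in symbols:
        decoded.append(add_residual(predict_next_latent(decoded), q, step))
    return decoded


# Toy latents: steady motion, then an abrupt change in the last frame.
toy_latents = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0], [3.0, 1.5], [9.0, -4.0]]
toy_symbols, toy_scores, _ = encode_sequence(toy_latents, step=0.1)
print("residual symbols:", toy_symbols)
print("anomaly scores:", toy_scores)
```

In this toy sequence the well-predicted frames produce all-zero residual symbols, which an entropy coder could represent cheaply, while the abrupt final frame produces both a large residual and the largest anomaly score. That is the sense in which a single latent predictor can serve both applications.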
Pages: 701-710
Page count: 10