Deep Learning in Latent Space for Video Prediction and Compression

Cited by: 45
Authors
Liu, Bowen [1]
Chen, Yu [1]
Liu, Shiyu [1]
Kim, Hun-Seok [1]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
Keywords
EVENT DETECTION;
DOI
10.1109/CVPR46437.2021.00076
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning-based video compression has achieved substantial progress in recent years. The most influential approaches adopt deep neural networks (DNNs) to remove spatial and temporal redundancies by finding appropriate lower-dimensional representations of the frames in a video. We propose a novel DNN-based framework that predicts and compresses video sequences in the latent vector space. The proposed method first learns an efficient lower-dimensional latent space representation of each video frame and then performs inter-frame prediction in that latent domain. The latent domain compression of individual frames is obtained by a deep autoencoder trained with a generative adversarial network (GAN). To exploit the temporal correlation within the video frame sequence, we employ a convolutional long short-term memory (ConvLSTM) network to predict the latent vector representation of the future frame. We demonstrate our method with two applications, video compression and abnormal event detection, which share the same latent frame prediction network. The proposed method exhibits superior or competitive performance compared to state-of-the-art algorithms specifically designed for either video compression or anomaly detection.
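The abstract says one latent-frame predictor serves both compression and abnormal event detection. The following is a minimal sketch of why that can work, not the authors' implementation. It assumes the latents are plain float vectors and leaves out the autoencoder. The ConvLSTM is replaced by a hypothetical linear-extrapolation stand-in (`predict_next`), and the uniform quantizer, the step size, and all function names are illustrative assumptions.

```python
from typing import List, Sequence

Latent = List[float]


def predict_next(history: Sequence[Latent]) -> Latent:
    """Hypothetical stand-in for the ConvLSTM predictor.

    Linearly extrapolates the next latent from the last two;
    with a single past latent it simply repeats it.
    """
    if len(history) < 2:
        return list(history[-1])
    prev, last = history[-2], history[-1]
    return [2.0 * b - a for a, b in zip(prev, last)]


def encode_residual(actual: Latent, predicted: Latent, step: float = 0.1) -> List[int]:
    """Compression use: quantize only the prediction residual (assumed uniform quantizer)."""
    return [round((a - p) / step) for a, p in zip(actual, predicted)]


def decode_latent(symbols: Sequence[int], predicted: Latent, step: float = 0.1) -> Latent:
    """Decoder side: rebuild the latent from the same prediction plus the dequantized residual."""
    return [p + s * step for p, s in zip(predicted, symbols)]


def anomaly_score(actual: Latent, predicted: Latent) -> float:
    """Anomaly-detection use: mean squared prediction error in latent space."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    history = [[0.0, 1.0], [1.0, 2.0]]   # latents of two past frames (toy values)
    actual = [2.2, 2.9]                  # latent of the current frame (toy values)
    predicted = predict_next(history)
    symbols = encode_residual(actual, predicted)
    print("residual symbols:", symbols)
    print("reconstructed latent:", decode_latent(symbols, predicted))
    print("anomaly score:", anomaly_score(actual, predicted))
```

In this sketch a well-predicted frame gives small residual symbols, which are cheap to code, and a low anomaly score. A poorly predicted frame raises both, so the same prediction error drives both applications.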
Pages: 701 - 710
Page count: 10
Related papers
50 in total
  • [21] Deep Learning for Image/Video Compression and Visual Quality Assessment
    Multimedia Tools and Applications, 2022, 81 : 42483 - 42483
  • [22] Deep Learning for Image/Video Compression and Visual Quality Assessment
    Pan, Zhaoqing
    Jeon, Byeungwoo
    Ling, Nam
    Peng, Bo
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (29) : 42483 - 42483
  • [23] High Efficiency Deep-learning Based Video Compression
    Tang, Lv
    Zhang, Xinfeng
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (08)
  • [24] Latent Matters: Learning Deep State-Space Models
    Klushyn, Alexej
    Kurle, Richard
    Soelch, Maximilian
    Cseke, Botond
    van der Smagt, Patrick
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [25] Deep Learning Protein Conformational Space with Convolutions and Latent Interpolations
    Ramaswamy, Venkata K.
    Musson, Samuel C.
    Willcocks, Chris G.
    Degiacomi, Matteo T.
    PHYSICAL REVIEW X, 2021, 11 (01)
  • [26] Revisiting Video Saliency Prediction in the Deep Learning Era
    Wang, Wenguan
    Shen, Jianbing
    Xie, Jianwen
    Cheng, Ming-Ming
    Ling, Haibin
    Borji, Ali
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (01) : 220 - 237
  • [27] FVC: A New Framework towards Deep Video Compression in Feature Space
    Hu, Zhihao
    Lu, Guo
    Xu, Dong
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021 : 1502 - 1511
  • [28] Scene-Dependent Prediction in Latent Space for Video Anomaly Detection and Anticipation
    Cao, Congqi
    Zhang, Hanwen
    Lu, Yue
    Wang, Peng
    Zhang, Yanning
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (01) : 224 - 239
  • [29] Learning for Video Compression
    Chen, Zhibo
    He, Tianyu
    Jin, Xin
    Wu, Feng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (02) : 566 - 576
  • [30] Guest Editorial: Special Issue on Deep Learning for Video Analysis and Compression
    Dong Xu
    Rama Chellappa
    Luc Van Gool
    Guo Lu
    International Journal of Computer Vision, 2021, 129 : 3171 - 3173