Deep Learning in Latent Space for Video Prediction and Compression

Cited by: 45
Authors
Liu, Bowen [1]
Chen, Yu [1]
Liu, Shiyu [1]
Kim, Hun-Seok [1]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
Keywords
EVENT DETECTION;
DOI
10.1109/CVPR46437.2021.00076
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Learning-based video compression has made substantial progress in recent years. The most influential approaches adopt deep neural networks (DNNs) to remove spatial and temporal redundancies by finding appropriate lower-dimensional representations of the frames in a video. We propose a novel DNN-based framework that predicts and compresses video sequences in the latent vector space. The proposed method first learns an efficient lower-dimensional latent space representation of each video frame and then performs inter-frame prediction in that latent domain. The latent-domain compression of individual frames is obtained by a deep autoencoder trained with a generative adversarial network (GAN). To exploit the temporal correlation within the video frame sequence, we employ a convolutional long short-term memory (ConvLSTM) network to predict the latent vector representation of the future frame. We demonstrate our method with two applications, video compression and abnormal event detection, which share an identical latent frame prediction network. The proposed method exhibits superior or competitive performance compared to state-of-the-art algorithms specifically designed for either video compression or anomaly detection.
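The data flow the abstract describes (encode each frame to a latent vector, predict the next latent from past latents, then reuse the prediction error for both applications) can be illustrated with a toy sketch. This is not the authors' implementation. The linear `encode` projection stands in for the GAN-trained autoencoder, the linear extrapolation in `predict_next_latent` stands in for the ConvLSTM, and every name, dimension, and the synthetic data are assumptions made for illustration. Reading the prediction residual as the quantity to compress, and its norm as an anomaly score, is one plausible interpretation of the abstract rather than something the record states.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the real model works on image tensors, not flat vectors.
FRAME_DIM, LATENT_DIM = 64, 8

# Assumption: a fixed random linear projection stands in for the learned encoder.
W_enc = rng.standard_normal((LATENT_DIM, FRAME_DIM)) / np.sqrt(FRAME_DIM)


def encode(frame):
    """Map a flattened frame to a lower-dimensional latent vector."""
    return W_enc @ frame


def predict_next_latent(history):
    """Toy stand-in for the ConvLSTM: linear extrapolation from the last two latents."""
    if len(history) < 2:
        return history[-1]
    return 2 * history[-1] - history[-2]


def latent_residuals(frames):
    """Return the latent prediction residual for each frame after the first."""
    latents = [encode(f) for f in frames]
    residuals = []
    for t in range(1, len(latents)):
        predicted = predict_next_latent(latents[:t])
        residuals.append(latents[t] - predicted)
    return np.array(residuals)


# Synthetic sequence: smooth linear drift, with an abrupt change injected at frame 6.
base = rng.standard_normal(FRAME_DIM)
drift = 0.01 * rng.standard_normal(FRAME_DIM)
frames = [base + t * drift for t in range(10)]
frames[6] = frames[6] + 5.0 * rng.standard_normal(FRAME_DIM)

residuals = latent_residuals(frames)        # row i belongs to frame i + 1
scores = np.linalg.norm(residuals, axis=1)  # prediction error as anomaly score
```

In this sketch the residuals for well-predicted frames are close to zero, which is what would make them cheap to transmit in a compression setting, while the injected change at frame 6 produces a large residual that a threshold on `scores` could flag.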
Pages: 701-710
Page count: 10