Conditional Temporal Variational AutoEncoder for Action Video Prediction

Cited by: 2
Authors
Xu, Xiaogang [1 ]
Wang, Yi [2 ]
Wang, Liwei [3 ]
Yu, Bei [3 ]
Jia, Jiaya [3 ]
Affiliations
[1] Zhejiang Lab, Hangzhou, Zhejiang, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Keywords
Variational AutoEncoder; Action modeling; Temporal coherence; Adversarial learning
DOI
10.1007/s11263-023-01832-8
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
To synthesize a realistic action sequence from a single human image, it is crucial to model both the motion patterns and the diversity of the action video. This paper proposes an Action Conditional Temporal Variational AutoEncoder (ACT-VAE) to improve motion prediction accuracy and capture movement diversity. ACT-VAE predicts the pose sequence of an action clip from a single input image. It is implemented as a deep generative model that maintains temporal coherence according to the action category through novel temporal modeling in the latent space. Further, ACT-VAE is a general action sequence prediction framework: when connected with a plug-and-play Pose-to-Image network, it can synthesize image sequences. Extensive experiments show that our approach predicts accurate poses and synthesizes realistic image sequences, surpassing state-of-the-art approaches. Compared to existing methods, ACT-VAE improves accuracy while preserving diversity.
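As a concrete reading of the abstract, the sketch below shows how one conditional temporal VAE step for pose-sequence prediction might be organized: the latent code at each time step is conditioned on the action category and a recurrent temporal state, which is what keeps sampled pose sequences coherent within an action class. All module names, layer sizes, and the GRU-based learned prior are illustrative assumptions, not the authors' ACT-VAE implementation.

```python
# Minimal sketch of a conditional temporal VAE step for pose-sequence prediction,
# in the spirit of ACT-VAE as described in the abstract. Architecture details
# (layer sizes, GRU prior) are assumptions for illustration only.
import torch
import torch.nn as nn

class CondTemporalVAE(nn.Module):
    def __init__(self, pose_dim=34, act_dim=10, z_dim=32, h_dim=128):
        super().__init__()
        # Recurrent state carries temporal context across time steps.
        self.rnn = nn.GRUCell(pose_dim + act_dim + z_dim, h_dim)
        # Posterior q(z_t | pose_t, action, h_{t-1}).
        self.enc = nn.Linear(pose_dim + act_dim + h_dim, 2 * z_dim)
        # Learned prior p(z_t | action, h_{t-1}) keeps latents action-conditioned.
        self.prior = nn.Linear(act_dim + h_dim, 2 * z_dim)
        # Decoder predicts the next pose from latent, action, and temporal context.
        self.dec = nn.Linear(z_dim + act_dim + h_dim, pose_dim)

    def step(self, pose_t, action, h):
        mu_q, logvar_q = self.enc(torch.cat([pose_t, action, h], -1)).chunk(2, -1)
        mu_p, logvar_p = self.prior(torch.cat([action, h], -1)).chunk(2, -1)
        # Reparameterized sample from the posterior.
        z = mu_q + torch.randn_like(mu_q) * (0.5 * logvar_q).exp()
        pose_next = self.dec(torch.cat([z, action, h], -1))
        h = self.rnn(torch.cat([pose_t, action, z], -1), h)
        # KL between the posterior and the learned, action-conditioned prior.
        kl = 0.5 * (logvar_p - logvar_q
                    + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                    - 1).sum(-1)
        return pose_next, h, kl

# Example rollout: sample a short pose sequence given an initial pose and action.
model = CondTemporalVAE()
pose = torch.randn(1, 34)                       # e.g. 17 joints x 2 coordinates
action = torch.zeros(1, 10); action[0, 3] = 1   # one-hot action category
h = torch.zeros(1, 128)
for _ in range(5):
    pose, h, kl = model.step(pose, action, h)
```

At test time only the learned prior and decoder would be used, so the sampled latent sequence (rather than the posterior) drives the rollout; the KL term above is the training-time regularizer that ties the two together.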
Pages: 2699-2722
Page count: 24