Conditional Temporal Variational AutoEncoder for Action Video Prediction

Cited by: 2
Authors
Xu, Xiaogang [1]
Wang, Yi [2]
Wang, Liwei [3]
Yu, Bei [3]
Jia, Jiaya [3]
Affiliations
[1] Zhejiang Lab, Hangzhou, Zhejiang, People's Republic of China
[2] Shanghai AI Lab, Shanghai, People's Republic of China
[3] Chinese University of Hong Kong, Hong Kong, People's Republic of China
Keywords
Variational AutoEncoder; Action modeling; Temporal coherence; Adversarial learning
DOI
10.1007/s11263-023-01832-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To synthesize a realistic action sequence from a single human image, it is crucial to model both the motion patterns and the diversity of the action video. This paper proposes an Action Conditional Temporal Variational AutoEncoder (ACT-VAE) to improve motion prediction accuracy and capture movement diversity. ACT-VAE predicts the pose sequence of an action clip from a single input image. It is implemented as a deep generative model that maintains temporal coherence according to the action category through a novel temporal modeling of the latent space. Further, ACT-VAE is a general action sequence prediction framework: when connected with a plug-and-play Pose-to-Image network, it can synthesize image sequences. Extensive experiments show that our approach predicts accurate poses and synthesizes realistic image sequences, surpassing state-of-the-art approaches. Compared to existing methods, ACT-VAE improves model accuracy and preserves diversity.
Pages: 2699-2722
Number of pages: 24