Scripted Video Generation With a Bottom-Up Generative Adversarial Network

Cited by: 14
Authors
Chen, Qi [1, 2]
Wu, Qi [3 ]
Chen, Jian [1 ]
Wu, Qingyao [1 ]
van den Hengel, Anton [3 ]
Tan, Mingkui [1 ]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou 510640, Peoples R China
[2] Pazhou Lab, Guangzhou 510335, Peoples R China
[3] Univ Adelaide, Sch Comp Sci, Adelaide, SA 5005, Australia
Funding
National Natural Science Foundation of China
Keywords
Generative adversarial networks; video generation; semantic alignment; temporal coherence
DOI
10.1109/TIP.2020.3003227
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Generating videos given a text description (such as a script) is non-trivial due to the intrinsic complexity of image frames and the structure of videos. Although Generative Adversarial Networks (GANs) have been successfully applied to generate images conditioned on a natural language description, it is still very challenging to generate realistic videos in which the frames must be both spatially and temporally coherent. In this paper, we propose a novel Bottom-up GAN (BoGAN) method for generating videos given a text description. To ensure the coherence of the generated frames and also make the whole video match the language description semantically, we design a bottom-up optimisation mechanism to train BoGAN. Specifically, we devise a region-level loss via an attention mechanism to preserve the local semantic alignment and draw details in different sub-regions of the video conditioned on the words that are most relevant to them. Moreover, to guarantee the matching between text and frame, we introduce a frame-level discriminator, which can also maintain the fidelity of each frame and the coherence across frames. Finally, to ensure the global semantic alignment between the whole video and the given text, we apply a video-level discriminator. We evaluate the effectiveness of the proposed BoGAN on two synthetic datasets (i.e., SBMG and TBMG) and two real-world datasets (i.e., MSVD and KTH).
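The abstract's idea of conditioning each sub-region on its most relevant words can be illustrated with a generic dot-product attention step. The sketch below is not BoGAN's actual formulation, which this record does not give. The function name `word_region_attention`, the feature shapes, and the toy vectors are all assumptions made for illustration.

```python
import math

def word_region_attention(regions, words):
    """Generic dot-product attention (illustrative, not the paper's exact loss).

    regions: list of R region feature vectors, each of length D
    words:   list of T word embedding vectors, each of length D
    Returns (context, weights): context[r] is a weighted mix of word vectors
    for region r, and weights[r][t] is the softmax weight of word t for region r.
    """
    dim = len(words[0])
    context, weights = [], []
    for region in regions:
        # Similarity of this region to every word.
        scores = [sum(a * b for a, b in zip(region, word)) for word in words]
        # Numerically stable softmax over words.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        # Word-context vector: attention-weighted sum of word embeddings.
        context.append([sum(p * word[d] for p, word in zip(row, words))
                        for d in range(dim)])
    return context, weights

# Toy example (hypothetical features): two regions, three words.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]]
context, weights = word_region_attention(regions, words)
```

In this toy setup each region puts most of its weight on the word whose embedding points in the same direction. That is the behaviour a region-level alignment loss would be expected to encourage.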
Pages: 7454-7467
Page count: 14