Text to image synthesis with multi-granularity feature aware enhancement Generative Adversarial Networks

Citations: 0
Authors
Dong, Pei [1]
Wu, Lei [1]
Li, Ruichen [1]
Meng, Xiangxu [1]
Meng, Lei [1]
Affiliations
[1] Shandong Univ, Sch Software, 1500 ShunHua Rd High Tech Ind Dev Zone, Jinan 250101, Peoples R China
Keywords
Generative adversarial network; Multi-granularity feature aware enhancement; Text-to-image; Autoregressive; Diffusion;
DOI
10.1016/j.cviu.2024.104042
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Synthesizing complex images from text is challenging. Compared with autoregressive and diffusion model-based methods, Generative Adversarial Network (GAN)-based methods have significant advantages in computational cost and generation efficiency, yet two limitations remain: first, these methods often refine all features output from the previous stage indiscriminately, without considering that these features are initialized gradually during the generation process; second, the sparse semantic constraints provided by the text description are typically ineffective for refining fine-grained features. These issues make it difficult to balance generation quality, computational cost, and inference speed. To address them, we propose a Multi-granularity Feature Aware Enhancement GAN (MFAE-GAN), which allows the refinement process to match the order in which features of different granularity are initialized. Specifically, MFAE-GAN (1) samples category-related coarse-grained features and instance-level detail-related fine-grained features at different generation stages, using different attention mechanisms in Coarse-grained Feature Enhancement (CFE) and Fine-grained Feature Enhancement (FFE), to guide the generation process spatially; (2) provides denser semantic constraints than textual semantic information through Multi-granularity Features Adaptive Batch Normalization (MFA-BN) when refining fine-grained features; and (3) adopts Global Semantics Preservation (GSP) to avoid losing global semantics when sampling features continuously. Extensive experimental results demonstrate that MFAE-GAN is competitive in both image generation quality and efficiency.
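The abstract does not specify how MFA-BN is computed. As a rough illustration of the general idea behind adaptive (conditional) batch normalization, in which the scale and shift applied after normalization are predicted from a conditioning vector, the following is a minimal pure-Python sketch. It is not the authors' MFA-BN: the function names, the linear predictors `w_gamma`, `b_gamma`, `w_beta`, `b_beta`, and the flat `(batch, channels)` layout are all assumptions made for illustration.

```python
import math


def linear(x, weight, bias):
    """Apply y = W x + b to one vector x; weight is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def conditional_batch_norm(features, cond, w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """Normalize each channel over the batch, then scale and shift it with
    per-sample gamma and beta predicted from a conditioning vector.

    features: list of N samples, each a list of C channel values
    cond:     list of N conditioning vectors, each of length D
    w_*, b_*: hypothetical linear-layer parameters mapping D -> C
    """
    n = len(features)
    channels = len(features[0])
    # Per-channel batch statistics (biased variance, as in training-mode batch norm).
    mean = [sum(f[c] for f in features) / n for c in range(channels)]
    var = [sum((f[c] - mean[c]) ** 2 for f in features) / n for c in range(channels)]
    out = []
    for f, z in zip(features, cond):
        gamma = linear(z, w_gamma, b_gamma)  # per-sample scale
        beta = linear(z, w_beta, b_beta)     # per-sample shift
        out.append([
            gamma[c] * (f[c] - mean[c]) / math.sqrt(var[c] + eps) + beta[c]
            for c in range(channels)
        ])
    return out


# Toy usage: two samples, two channels, one-dimensional conditioning vector.
feats = [[1.0, 2.0], [3.0, 6.0]]
cond = [[1.0], [0.0]]
result = conditional_batch_norm(
    feats, cond,
    w_gamma=[[0.0], [0.0]], b_gamma=[1.0, 1.0],  # gamma fixed at 1
    w_beta=[[2.0], [0.0]], b_beta=[0.0, 0.0],    # beta depends on cond
)
```

In a text-only conditional GAN the conditioning vector would typically be a sentence embedding; the abstract indicates that MFA-BN instead draws on multi-granularity features to supply denser constraints, but the exact form of that conditioning is not given here.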
Pages: 11