Text to image synthesis with multi-granularity feature aware enhancement Generative Adversarial Networks

Cited by: 0
Authors
Dong, Pei [1]
Wu, Lei [1]
Li, Ruichen [1]
Meng, Xiangxu [1]
Meng, Lei [1]
Affiliations
[1] Shandong Univ, Sch Software, 1500 ShunHua Rd High Tech Ind Dev Zone, Jinan 250101, Peoples R China
Keywords
Generative adversarial network; Multi-granularity feature aware enhancement; Text-to-image; Autoregressive; Diffusion;
DOI
10.1016/j.cviu.2024.104042
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Synthesizing complex images from text is challenging. Compared with autoregressive and diffusion model-based methods, Generative Adversarial Network (GAN)-based methods have significant advantages in computational cost and generation efficiency, yet two limitations remain: first, these methods often refine all features output from the previous stage indiscriminately, without considering that these features are initialized gradually during the generation process; second, the sparse semantic constraints provided by the text description are typically ineffective for refining fine-grained features. These issues complicate the balance between generation quality, computational cost, and inference speed. To address them, we propose a Multi-granularity Feature Aware Enhancement GAN (MFAE-GAN), which allows the refinement process to match the order in which features of different granularity are initialized. Specifically, MFAE-GAN (1) samples category-related coarse-grained features and instance-level, detail-related fine-grained features at different generation stages, using different attention mechanisms in Coarse-grained Feature Enhancement (CFE) and Fine-grained Feature Enhancement (FFE), to guide the generation process spatially; (2) provides denser semantic constraints than textual semantic information through Multi-granularity Features Adaptive Batch Normalization (MFA-BN) when refining fine-grained features; and (3) adopts Global Semantics Preservation (GSP) to avoid the loss of global semantics when sampling features continuously. Extensive experimental results demonstrate that MFAE-GAN is competitive in both image generation quality and efficiency.
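The abstract does not give the formulation of MFA-BN. As a rough illustration of the general idea behind adaptive (conditional) batch normalization only, the sketch below normalizes each channel over the batch and then applies a per-sample scale and shift predicted from a conditioning vector. The function name, the linear maps `w_gamma` and `w_beta`, and the tensor shapes are assumptions made for this example and are not taken from the paper.

```python
import math


def conditional_batch_norm(x, cond, w_gamma, w_beta, eps=1e-5):
    """Generic conditional batch norm sketch (not the paper's MFA-BN).

    x:       batch of feature vectors, shape [N][C]
    cond:    batch of conditioning vectors, shape [N][D]
    w_gamma: hypothetical linear map [C][D] predicting the scale offset
    w_beta:  hypothetical linear map [C][D] predicting the shift
    """
    n, c = len(x), len(x[0])
    # Per-channel statistics computed over the batch dimension.
    mean = [sum(row[j] for row in x) / n for j in range(c)]
    var = [sum((row[j] - mean[j]) ** 2 for row in x) / n for j in range(c)]

    out = []
    for row, cvec in zip(x, cond):
        # Scale and shift depend on the conditioning vector of each sample.
        gamma = [1.0 + sum(w * v for w, v in zip(w_gamma[j], cvec)) for j in range(c)]
        beta = [sum(w * v for w, v in zip(w_beta[j], cvec)) for j in range(c)]
        out.append([
            gamma[j] * (row[j] - mean[j]) / math.sqrt(var[j] + eps) + beta[j]
            for j in range(c)
        ])
    return out


# Two samples, two channels, one-dimensional condition.
features = [[1.0, 2.0], [3.0, 6.0]]
condition = [[1.0], [0.5]]
zero_map = [[0.0], [0.0]]
plain = conditional_batch_norm(features, condition, zero_map, zero_map)
shifted = conditional_batch_norm(features, condition, zero_map, [[1.0], [0.0]])
```

With both maps set to zero the function reduces to plain batch normalization; a non-zero `w_beta` shifts each sample by an amount that depends on its own conditioning vector, which is the sense in which the conditioning signal constrains the features.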
Pages: 11
Related papers
50 in total
  • [41] StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
    Zhang, Han
    Xu, Tao
    Li, Hongsheng
    Zhang, Shaoting
    Wang, Xiaogang
    Huang, Xiaolei
    Metaxas, Dimitris
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5908 - 5916
  • [42] MGF-GAN: Multi Granularity Text Feature Fusion for Text-guided-Image Synthesis
    Wang, Xingfu
    Li, Xiangyu
    Hawbani, Ammar
    Zhao, Liang
    Alsamhi, Saeed Hamood
    2022 IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, 2022, : 1398 - 1403
  • [43] Hybrid Attention Driven Text-to-Image Synthesis via Generative Adversarial Networks
    Cheng, Qingrong
    Gu, Xiaodong
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: WORKSHOP AND SPECIAL SESSIONS, 2019, 11731 : 483 - 495
  • [44] Generative adversarial networks with multi-scale and attention mechanisms for underwater image enhancement
    Wang, Ziyang
    Zhao, Liquan
    Zhong, Tie
    Jia, Yanfei
    Cui, Ying
    FRONTIERS IN MARINE SCIENCE, 2023, 10
  • [45] Multi-Granularity Feature Fusion for Image-Guided Story Ending Generation
    Li, Pijian
    Huang, Qingbao
    Li, Zhigang
    Cai, Yi
    Shuang, Feng
    Li, Qing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 3437 - 3449
  • [46] Speech Enhancement with Multi-granularity Vector Quantization
    Zhao, Xiaoying
    Zhu, Qiushi
    Zhang, Jie
    Zhou, Yeping
    Liu, Peiqi
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 1937 - 1942
  • [47] FGGAN: Feature-Guiding Generative Adversarial Networks for Text Generation
    Yang, Yang
    Dan, Xiaodong
    Qiu, Xuesong
    Gao, Zhipeng
    IEEE ACCESS, 2020, 8 (08) : 105217 - 105225
  • [48] Image Synthesis in Multi-Contrast MRI With Conditional Generative Adversarial Networks
    Dar, Salman U. H.
    Yurt, Mahmut
    Karacan, Levent
    Erdem, Aykut
    Erdem, Erkut
    Cukur, Tolga
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2019, 38 (10) : 2375 - 2388
  • [49] mustGAN: multi-stream Generative Adversarial Networks for MR Image Synthesis
    Yurt, Mahmut
    Dar, Salman U. H.
    Erdem, Aykut
    Erdem, Erkut
    Oguz, Kader K.
    Cukur, Tolga
    MEDICAL IMAGE ANALYSIS, 2021, 70
  • [50] Multi-Instance Sketch to Image Synthesis With Progressive Generative Adversarial Networks
    Wang, Zhi-Hui
    Wang, Ning
    Shi, Jian
    Li, Jian-Jun
    Yang, Hairui
    IEEE ACCESS, 2019, 7 : 56683 - 56693