Attribute-Centric Compositional Text-to-Image Generation

Cited by: 0
Authors
Cong, Yuren [1]
Min, Martin Renqiang [2]
Li, Li Erran [3]
Rosenhahn, Bodo [1]
Yang, Michael Ying [4]
Affiliations
[1] Leibniz Univ Hannover, Inst Informat Proc, Hannover, Germany
[2] NEC Labs Amer, Princeton, NJ USA
[3] Amazon, AWS AI, San Francisco, CA USA
[4] Univ Bath, Visual Comp Grp, Bath, England
Keywords
Text-to-image; Compositional generation; Attribute-centric
DOI
10.1007/s11263-025-02371-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite the recent impressive breakthroughs in text-to-image generation, generative models have difficulty capturing the data distribution of underrepresented attribute compositions while over-memorizing overrepresented attribute compositions, which raises public concerns about their robustness and fairness. To tackle this challenge, we propose ACTIG, an attribute-centric compositional text-to-image generation framework. We present an attribute-centric feature augmentation and a novel image-free training scheme, which greatly improve the model's ability to generate images with underrepresented attributes. We further propose an attribute-centric contrastive loss to avoid overfitting to overrepresented attribute compositions. We validate our framework on the CelebA-HQ and CUB datasets. Extensive experiments show that the compositional generalization of ACTIG is outstanding, and our framework outperforms previous works in terms of image quality and text-image consistency. The source code and trained models are publicly available at https://github.com/yrcong/ACTIG.
Pages: 16