ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation

Cited by: 0
Authors
Jha, Akshita [1, 2]
Prabhakaran, Vinodkumar [2]
Denton, Remi [2]
Laszlo, Sarah [2]
Dave, Shachi [2]
Qadri, Rida [2]
Reddy, Chandan K. [1]
Dev, Sunipa [2]
Affiliations
[1] Virginia Tech, Blacksburg, VA 24061 USA
[2] Google Research, Mountain View, CA USA
DOI: not available
Abstract
Recent studies have shown that Text-to-Image (T2I) model generations can reflect social stereotypes present in the real world. However, existing approaches for evaluating stereotypes have a noticeable lack of coverage of global identity groups and their associated stereotypes. To address this gap, we introduce the ViSAGe (Visual Stereotypes Around the Globe) dataset to enable evaluation of known nationality-based stereotypes in T2I models, across 135 nationalities. We enrich an existing textual stereotype resource by distinguishing between stereotypical associations that are more likely to have visual depictions, such as 'sombrero', from those that are less visually concrete, such as 'attractive'. We demonstrate ViSAGe's utility through a multi-faceted evaluation of T2I generations. First, we show that stereotypical attributes in ViSAGe are thrice as likely to be present in generated images of corresponding identities as compared to other attributes, and that the offensiveness of these depictions is especially higher for identities from Africa, South America, and South East Asia. Second, we assess the stereotypical pull of visual depictions of identity groups, which reveals how the 'default' representations of all identity groups in ViSAGe have a pull towards stereotypical depictions, and that this pull is even more prominent for identity groups from the Global South. CONTENT WARNING: Some examples contain offensive stereotypes.
Pages: 12333-12347
Page count: 15