ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation

Authors
Jha, Akshita [1, 2]
Prabhakaran, Vinodkumar [2]
Denton, Remi [2]
Laszlo, Sarah [2]
Dave, Shachi [2]
Qadri, Rida [2]
Reddy, Chandan K. [1]
Dev, Sunipa [2]
Affiliations
[1] Virginia Tech, Blacksburg, VA 24061 USA
[2] Google Research, Mountain View, CA USA
Abstract
Recent studies have shown that Text-to-Image (T2I) model generations can reflect social stereotypes present in the real world. However, existing approaches for evaluating stereotypes have a noticeable lack of coverage of global identity groups and their associated stereotypes. To address this gap, we introduce the ViSAGe (Visual Stereotypes Around the Globe) dataset to enable evaluation of known nationality-based stereotypes in T2I models, across 135 nationalities. We enrich an existing textual stereotype resource by distinguishing between stereotypical associations that are more likely to have visual depictions, such as 'sombrero', from those that are less visually concrete, such as 'attractive'. We demonstrate ViSAGe's utility through a multi-faceted evaluation of T2I generations. First, we show that stereotypical attributes in ViSAGe are thrice as likely to be present in generated images of corresponding identities as compared to other attributes, and that the offensiveness of these depictions is especially higher for identities from Africa, South America, and South East Asia. Second, we assess the stereotypical pull of visual depictions of identity groups, which reveals how the 'default' representations of all identity groups in ViSAGe have a pull towards stereotypical depictions, and that this pull is even more prominent for identity groups from the Global South. CONTENT WARNING: Some examples contain offensive stereotypes.
Pages: 12333-12347
Page count: 15