ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation

Cited by: 0

Authors:

Jha, Akshita [1,2]

Prabhakaran, Vinodkumar [2]

Denton, Remi [2]

Laszlo, Sarah [2]

Dave, Shachi [2]

Qadri, Rida [2]

Reddy, Chandan K. [1]

Dev, Sunipa [2]

Affiliations:

[1] Virginia Tech, Blacksburg, VA 24061 USA

[2] Google Research, Mountain View, CA USA

Source:

PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS | 2024

Abstract:

Recent studies have shown that Text-to-Image (T2I) model generations can reflect social stereotypes present in the real world. However, existing approaches for evaluating stereotypes have a noticeable lack of coverage of global identity groups and their associated stereotypes. To address this gap, we introduce the ViSAGe (Visual Stereotypes Around the Globe) dataset to enable evaluation of known nationality-based stereotypes in T2I models, across 135 nationalities. We enrich an existing textual stereotype resource by distinguishing between stereotypical associations that are more likely to have visual depictions, such as 'sombrero', from those that are less visually concrete, such as 'attractive'. We demonstrate ViSAGe's utility through a multi-faceted evaluation of T2I generations. First, we show that stereotypical attributes in ViSAGe are thrice as likely to be present in generated images of corresponding identities as compared to other attributes, and that the offensiveness of these depictions is especially higher for identities from Africa, South America, and South East Asia. Second, we assess the stereotypical pull of visual depictions of identity groups, which reveals how the 'default' representations of all identity groups in ViSAGe have a pull towards stereotypical depictions, and that this pull is even more prominent for identity groups from the Global South. CONTENT WARNING: Some examples contain offensive stereotypes.


Pages: 12333-12347

Page count: 15
