Gender Biases in Automatic Evaluation Metrics for Image Captioning

Cited by: 0
Authors
Qiu, Haoyi [1]
Dou, Zi-Yi [1]
Wang, Tianlu [2]
Celikyilmaz, Asli [2]
Peng, Nanyun [1]
Affiliations
[1] University of California, Los Angeles, Los Angeles, CA 90024, USA
[2] Meta AI Research, Menlo Park, CA, USA
Keywords
(none listed)
DOI
(not available)
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Model-based evaluation metrics (e.g., CLIPScore and GPTScore) have demonstrated decent correlations with human judgments in various language generation tasks. However, their impact on fairness remains largely unexplored. It is widely recognized that pretrained models can inadvertently encode societal biases, so employing these models for evaluation may perpetuate and amplify those biases. For example, an evaluation metric may favor the caption "a woman is calculating an account book" over "a man is calculating an account book," even if the image only shows male accountants. In this paper, we conduct a systematic study of gender biases in model-based automatic evaluation metrics for image captioning tasks. We start by curating a dataset comprising profession, activity, and object concepts with stereotypical gender associations. We then demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations, as well as the propagation of biases to generation models through reinforcement learning. Finally, we present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments. Our dataset and framework lay the foundation for understanding the potential harm of model-based evaluation metrics, and facilitate future work on developing more inclusive evaluation metrics.
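
As a concrete illustration of the failure mode the abstract describes, the sketch below scores a gender-swapped caption pair against the same image with a CLIPScore-style metric. This is a minimal sketch, not the paper's evaluation protocol: the checkpoint openai/clip-vit-base-patch32, the image file accountant.jpg, and the caption pair are illustrative assumptions, and the 2.5 * max(cosine, 0) scaling follows the standard CLIPScore definition (Hessel et al., 2021).

    # Minimal sketch (not the paper's protocol): probe a CLIPScore-style metric
    # with a gender-swapped caption pair on the same image. The checkpoint,
    # image path, and captions are illustrative assumptions.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    def clipscore(image: Image.Image, caption: str) -> float:
        # CLIPScore = 2.5 * max(cos(image_emb, text_emb), 0), per Hessel et al. (2021).
        inputs = processor(text=[caption], images=image,
                           return_tensors="pt", padding=True)
        with torch.no_grad():
            out = model(**inputs)
        # The returned projections are L2-normalized; re-normalizing is a safe no-op.
        img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
        txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
        return 2.5 * max((img * txt).sum().item(), 0.0)

    image = Image.open("accountant.jpg")  # hypothetical image showing male accountants
    s_man = clipscore(image, "a man is calculating an account book")
    s_woman = clipscore(image, "a woman is calculating an account book")
    print(f"man: {s_man:.3f}  woman: {s_woman:.3f}  gap: {s_woman - s_man:+.3f}")

A single pair proves nothing on its own; what would indicate a gendered preference in the metric is a gap that consistently points one way when aggregated over many images and over the profession, activity, and object concepts the paper curates.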
Pages: 8358-8375
Page count: 18