Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories

Cited by: 0
Authors
Chaloner, Kaytlin [1 ]
Maldonado, Alfredo [1 ]
Affiliations
[1] Trinity Coll Dublin, ADAPT Ctr, SCSS, Dublin, Ireland
Source
GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019) | 2019
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104; 0812; 0835; 1405
Abstract
Prior work has shown that word embeddings capture human stereotypes, including gender bias. However, there is a lack of studies testing the presence of specific gender bias categories in word embeddings across diverse domains. This paper aims to fill this gap by applying the WEAT bias detection method to four sets of word embeddings trained on corpora from four different domains: news, social networking, biomedical and a gender-balanced corpus extracted from Wikipedia (GAP). We find that some domains are definitely more prone to gender bias than others, and that the categories of gender bias present also vary for each set of word embeddings. We detect some gender bias in GAP. We also propose a simple but novel method for discovering new bias categories by clustering word embeddings. We validate this method through WEAT's hypothesis testing mechanism and find it useful for expanding the relatively small set of well-known gender bias word categories commonly used in the literature.
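The abstract refers to the WEAT bias detection method (Caliskan et al., 2017). As a rough illustration of what that test measures, the Python sketch below (not the authors' code; the function names are assumptions, and WEAT's permutation-based significance test is omitted for brevity) computes the standard WEAT effect size between two target word sets X, Y and two attribute sets A, B, given their embedding vectors. The paper additionally relies on WEAT's hypothesis-testing mechanism to validate the bias categories discovered by clustering.

import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    # s(w, A, B): mean similarity of w to attribute set A minus its mean similarity to B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # Effect size: difference of the mean associations of target sets X and Y,
    # scaled by the standard deviation over all target words (Cohen's-d style).
    assoc_X = [association(x, A, B) for x in X]
    assoc_Y = [association(y, A, B) for y in Y]
    return (np.mean(assoc_X) - np.mean(assoc_Y)) / np.std(assoc_X + assoc_Y)

# Hypothetical usage: X and Y hold vectors for, e.g., career vs. family terms,
# A and B vectors for male vs. female attribute words, drawn from one of the
# four domain-specific embedding sets (news, social networking, biomedical, GAP).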
Pages: 25-32
Page count: 8
Related Papers
50 records in total
  • [1] Gender Bias in Contextualized Word Embeddings
    Zhao, Jieyu
    Wang, Tianlu
    Yatskar, Mark
    Cotterell, Ryan
    Ordonez, Vicente
    Chang, Kai-Wei
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 629 - 634
  • [2] Investigation of Gender Bias in Turkish Word Embeddings
    Sevim, Nurullah
    Koc, Aykut
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [3] Evaluating the Underlying Gender Bias in Contextualized Word Embeddings
    Basta, Christine
    Costa-jussa, Marta R.
    Casas, Noe
    GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019), 2019, : 33 - 39
  • [4] Extensive study on the underlying gender bias in contextualized word embeddings
    Christine Basta
    Marta R. Costa-jussà
    Noe Casas
    Neural Computing and Applications, 2021, 33 : 3371 - 3384
  • [5] Detecting gender bias in Arabic text through word embeddings
    Mourad, Aya
    Abu Salem, Fatima K.
    Elbassuoni, Shady
    PLOS ONE, 2025, 20 (03):
  • [6] Extensive study on the underlying gender bias in contextualized word embeddings
    Basta, Christine
    Costa-jussa, Marta R.
    Casas, Noe
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (08): 3371 - 3384
  • [7] Iterative Adversarial Removal of Gender Bias in Pretrained Word Embeddings
    Gaci, Yacine
    Benatallah, Boualem
    Casati, Fabio
    Benabdeslem, Khalid
    37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2022, : 829 - 836
  • [8] Quantifying 60 Years of Gender Bias in Biomedical Research with Word Embeddings
    Rios, Anthony
    Joshi, Reenam
    Shin, Hejin
    19TH SIGBIOMED WORKSHOP ON BIOMEDICAL LANGUAGE PROCESSING (BIONLP 2020), 2020, : 1 - 13
  • [9] Gender Bias in Word Embeddings: A Comprehensive Analysis of Frequency, Syntax, and Semantics
    Caliskan, Aylin
    Ajay, Pimparkar Parth
    Charlesworth, Tessa
    Wolfe, Robert
    Banaji, Mahzarin R.
    PROCEEDINGS OF THE 2022 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, AIES 2022, 2022, : 156 - 170
  • [10] Equalizing Gender Bias in Neural Machine Translation with Word Embeddings Techniques
    Font, Joel Escude
    Costa-jussa, Marta R.
    GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019), 2019, : 147 - 154