Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories

Cited by: 0
Authors
Chaloner, Kaytlin [1 ]
Maldonado, Alfredo [1 ]
Affiliations
[1] Trinity Coll Dublin, ADAPT Ctr, SCSS, Dublin, Ireland
Source
GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019) | 2019
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104; 0812; 0835; 1405
Abstract
Prior work has shown that word embeddings capture human stereotypes, including gender bias. However, there is a lack of studies testing the presence of specific gender bias categories in word embeddings across diverse domains. This paper aims to fill this gap by applying the WEAT bias detection method to four sets of word embeddings trained on corpora from four different domains: news, social networking, biomedical and a gender-balanced corpus extracted from Wikipedia (GAP). We find that some domains are definitely more prone to gender bias than others, and that the categories of gender bias present also vary for each set of word embeddings. We detect some gender bias in GAP. We also propose a simple but novel method for discovering new bias categories by clustering word embeddings. We validate this method through WEAT's hypothesis testing mechanism and find it useful for expanding the relatively small set of well-known gender bias word categories commonly used in the literature.
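The abstract refers to the WEAT bias detection method (Caliskan et al., 2017). As a rough illustration of what that test measures, the Python sketch below (not the authors' code; the function names are assumptions, and WEAT's permutation-based significance test is omitted for brevity) computes the standard WEAT effect size between two target word sets X, Y and two attribute sets A, B, given their embedding vectors. The paper additionally relies on WEAT's hypothesis-testing mechanism to validate the bias categories discovered by clustering.

import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    # s(w, A, B): mean similarity of w to attribute set A minus its mean similarity to B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # Effect size: difference of the mean associations of target sets X and Y,
    # scaled by the standard deviation over all target words (Cohen's-d style).
    assoc_X = [association(x, A, B) for x in X]
    assoc_Y = [association(y, A, B) for y in Y]
    return (np.mean(assoc_X) - np.mean(assoc_Y)) / np.std(assoc_X + assoc_Y)

# Hypothetical usage: X and Y hold vectors for, e.g., career vs. family terms,
# A and B vectors for male vs. female attribute words, drawn from one of the
# four domain-specific embedding sets (news, social networking, biomedical, GAP).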
Pages: 25-32
Page count: 8
Related Papers
50 records in total
  • [1] Gender Bias in Contextualized Word Embeddings
    Zhao, Jieyu
    Wang, Tianlu
    Yatskar, Mark
    Cotterell, Ryan
    Ordonez, Vicente
    Chang, Kai-Wei
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 629 - 634
  • [2] Investigation of Gender Bias in Turkish Word Embeddings
    Sevim, Nurullah
    Koc, Aykut
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [3] Evaluating the Underlying Gender Bias in Contextualized Word Embeddings
    Basta, Christine
    Costa-jussa, Marta R.
    Casas, Noe
    GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019), 2019, : 33 - 39
  • [4] Extensive study on the underlying gender bias in contextualized word embeddings
    Christine Basta
    Marta R. Costa-jussà
    Noe Casas
    Neural Computing and Applications, 2021, 33 : 3371 - 3384
  • [5] Detecting gender bias in Arabic text through word embeddings
    Mourad, Aya
    Abu Salem, Fatima K.
    Elbassuoni, Shady
    PLOS ONE, 2025, 20 (03):
  • [6] Extensive study on the underlying gender bias in contextualized word embeddings
    Basta, Christine
    Costa-jussa, Marta R.
    Casas, Noe
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (08): 3371 - 3384
  • [7] Iterative Adversarial Removal of Gender Bias in Pretrained Word Embeddings
    Gaci, Yacine
    Benatallah, Boualem
    Casati, Fabio
    Benabdeslem, Khalid
    37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2022, : 829 - 836
  • [8] Quantifying 60 Years of Gender Bias in Biomedical Research with Word Embeddings
    Rios, Anthony
    Joshi, Reenam
    Shin, Hejin
    19TH SIGBIOMED WORKSHOP ON BIOMEDICAL LANGUAGE PROCESSING (BIONLP 2020), 2020, : 1 - 13
  • [9] Gender Bias in Word Embeddings: A Comprehensive Analysis of Frequency, Syntax, and Semantics
    Caliskan, Aylin
    Ajay, Pimparkar Parth
    Charlesworth, Tessa
    Wolfe, Robert
    Banaji, Mahzarin R.
    PROCEEDINGS OF THE 2022 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, AIES 2022, 2022, : 156 - 170
  • [10] Equalizing Gender Bias in Neural Machine Translation with Word Embeddings Techniques
    Font, Joel Escude
    Costa-jussa, Marta R.
    GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019), 2019, : 147 - 154