Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories

Cited by: 0
Authors
Chaloner, Kaytlin [1]
Maldonado, Alfredo [1]
Affiliations
[1] Trinity Coll Dublin, ADAPT Ctr, SCSS, Dublin, Ireland
Source
GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019) | 2019
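Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code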
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prior work has shown that word embeddings capture human stereotypes, including gender bias. However, there is a lack of studies testing the presence of specific gender bias categories in word embeddings across diverse domains. This paper aims to fill this gap by applying the WEAT bias detection method to four sets of word embeddings trained on corpora from four different domains: news, social networking, biomedical and a gender-balanced corpus extracted from Wikipedia (GAP). We find that some domains are definitely more prone to gender bias than others, and that the categories of gender bias present also vary for each set of word embeddings. We detect some gender bias in GAP. We also propose a simple but novel method for discovering new bias categories by clustering word embeddings. We validate this method through WEAT's hypothesis testing mechanism and find it useful for expanding the relatively small set of well-known gender bias word categories commonly used in the literature.
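For readers unfamiliar with WEAT, the following is a minimal sketch of its association score, effect size and permutation test, written from the general definition of WEAT given by Caliskan et al. (2017) rather than from this paper's own code. The function names, the 2-D toy vectors, the use of the sample standard deviation and the random-sampling approximation of the permutation test are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """s(w, A, B): mean similarity of w to attribute set A minus that to B."""
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def weat_effect_size(X, Y, A, B):
    """Difference of mean associations of target sets X and Y,
    normalised by the standard deviation over X and Y combined.
    Assumption: the sample standard deviation is used here."""
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)


def weat_p_value(X, Y, A, B, n_samples=1000, seed=0):
    """One-sided permutation test, approximated by random equal-size
    re-partitions of the combined target words (an assumption; an exact
    test would enumerate every partition)."""
    rng = random.Random(seed)
    scores = [association(w, A, B) for w in X + Y]
    n = len(X)
    observed = sum(scores[:n]) - sum(scores[n:])
    indices = list(range(len(scores)))
    exceed = 0
    for _ in range(n_samples):
        rng.shuffle(indices)
        stat = (sum(scores[i] for i in indices[:n])
                - sum(scores[i] for i in indices[n:]))
        if stat > observed:
            exceed += 1
    return exceed / n_samples


# Hypothetical 2-D "embeddings": X leans towards A, Y leans towards B.
A = [[1.0, 0.0]]               # attribute set A (e.g. male terms)
B = [[0.0, 1.0]]               # attribute set B (e.g. female terms)
X = [[1.0, 0.1], [0.9, 0.2]]   # target set X (e.g. career words)
Y = [[0.1, 1.0], [0.2, 0.9]]   # target set Y (e.g. family words)

print("effect size:", weat_effect_size(X, Y, A, B))
print("p-value:", weat_p_value(X, Y, A, B))
```

With real embeddings, each vector would be looked up from the trained model for the corresponding word. A large positive effect size together with a small p-value suggests that X is more strongly associated with A than Y is.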
Pages: 25-32
Number of pages: 8