Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories

Cited by: 0

Authors:

Chaloner, Kaytlin [1]

Maldonado, Alfredo [1]

Affiliations:

[1] Trinity Coll Dublin, ADAPT Ctr, SCSS, Dublin, Ireland

Source:

GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019) | 2019

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Prior work has shown that word embeddings capture human stereotypes, including gender bias. However, there is a lack of studies testing the presence of specific gender bias categories in word embeddings across diverse domains. This paper aims to fill this gap by applying the WEAT bias detection method to four sets of word embeddings trained on corpora from four different domains: news, social networking, biomedical and a gender-balanced corpus extracted from Wikipedia (GAP). We find that some domains are definitely more prone to gender bias than others, and that the categories of gender bias present also vary for each set of word embeddings. We detect some gender bias in GAP. We also propose a simple but novel method for discovering new bias categories by clustering word embeddings. We validate this method through WEAT's hypothesis testing mechanism and find it useful for expanding the relatively small set of well-known gender bias word categories commonly used in the literature.
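For readers unfamiliar with WEAT, the following is a minimal sketch of the test as it is commonly formulated (an effect size plus a permutation-based p-value). It is not the authors' code. The function names, the sample standard deviation (`ddof=1`), the random-permutation approximation, and the toy 2-D vectors are all assumptions made purely for illustration.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    # Mean similarity of w to attribute set A minus mean similarity to B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # Difference of mean associations of the two target sets,
    # normalised by the standard deviation over all target words.
    sx = np.array([association(x, A, B) for x in X])
    sy = np.array([association(y, A, B) for y in Y])
    return (sx.mean() - sy.mean()) / np.concatenate([sx, sy]).std(ddof=1)

def weat_p_value(X, Y, A, B, n_perm=1000, seed=0):
    # One-sided permutation test: the fraction of random equal-size
    # re-partitions of X and Y whose statistic exceeds the observed one.
    rng = np.random.default_rng(seed)
    s = np.array([association(w, A, B) for w in list(X) + list(Y)])
    n = len(X)
    observed = s[:n].sum() - s[n:].sum()
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(s))
        if s[perm[:n]].sum() - s[perm[n:]].sum() > observed:
            count += 1
    return count / n_perm

# Toy, made-up 2-D "embeddings" (hypothetical, not from any real corpus).
X = np.array([[1.0, 0.1], [1.0, 0.2]])   # target set 1
Y = np.array([[0.1, 1.0], [0.2, 1.0]])   # target set 2
A = np.array([[1.0, 0.0]])               # attribute set 1
B = np.array([[0.0, 1.0]])               # attribute set 2

effect = weat_effect_size(X, Y, A, B)
p = weat_p_value(X, Y, A, B, n_perm=200)
print(effect, p)
```

In practice the vectors would be looked up from a trained embedding model for each word in the target and attribute lists; with small word sets, all partitions can be enumerated exactly instead of sampled.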


Pages: 25 - 32

Number of pages: 8

50 items in total

[21] A World Full of Stereotypes? Further Investigation on Origin and Gender Bias in Multi-Lingual Word Embeddings
Kurpicz-Briki, Mascha
Leoni, Tomaso
FRONTIERS IN BIG DATA, 2021, 4
[22] Studying Political Bias via Word Embeddings
Gordon, Joshua
Babaeianjelodar, Marzieh
Matthews, Jeanna
WWW'20: COMPANION PROCEEDINGS OF THE WEB CONFERENCE 2020, 2020: 760 - 764
[23] Effect of dimensionality change on the bias of word embeddings
Rai, Rohit Raj
Awekar, Amit
PROCEEDINGS OF 7TH JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MANAGEMENT OF DATA, CODS-COMAD 2024, 2024: 601 - 602
[24] Quantifying and Debiasing Gender Bias in Japanese Gender-specific Words with Word Embedding
Chen, Leisi
Sugimoto, Toru
2022 JOINT 12TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS AND 23RD INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (SCIS&ISIS), 2022
[25] Measuring Gender Bias in Word Embeddings of Gendered Languages Requires Disentangling Grammatical Gender Signals
Sabbaghi, Shiva Omrani
Caliskan, Aylin
PROCEEDINGS OF THE 2022 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY, AIES 2022, 2022: 518 - 531
[26] Measuring Bias in Contextualized Word Representations
Kurita, Keita
Vyas, Nidhi
Pareek, Ayush
Black, Alan W.
Tsvetkov, Yulia
GENDER BIAS IN NATURAL LANGUAGE PROCESSING (GEBNLP 2019), 2019: 166 - 172
[27] Word embeddings are biased. But whose bias are they reflecting?
Petreski, Davor
Hashim, Ibrahim C.
AI & SOCIETY, 2023, 38 (02) : 975 - 982
[28] Identifying and Reducing Gender Bias in Word-Level Language Models
Bordia, Shikha
Bowman, Samuel R.
NAACL HLT 2019: THE 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2019: 7 - 15
[29] How to Split: the Effect of Word Segmentation on Gender Bias in Speech Translation
Gaido, Marco
Savoldi, Beatrice
Bentivogli, Luisa
Negri, Matteo
Turchi, Marco
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021: 3576 - 3589
[30] A Causal Inference Method for Reducing Gender Bias in Word Embedding Relations
Yang, Zekun
Feng, Juan
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9434 - 9441
