Semantics derived automatically from language corpora contain human-like biases

Cited by: 1598
Authors
Caliskan, Aylin [1]
Bryson, Joanna J. [1, 2]
Narayanan, Arvind [1]
Affiliations
[1] Princeton Univ, Ctr Informat Technol Policy, Princeton, NJ 08544 USA
[2] Univ Bath, Dept Comp Sci, Bath BA2 7AY, Avon, England
Keywords
IMPLICIT; STEREOTYPES; ATTITUDES
DOI
10.1126/science.aal4230
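Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract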
Machine learning is a means to derive artificial intelligence by discovering patterns in existing data. Here, we show that applying machine learning to ordinary human language results in human-like semantic biases. We replicated a spectrum of known biases, as measured by the Implicit Association Test, using a widely used, purely statistical machine-learning model trained on a standard corpus of text from the World Wide Web. Our results indicate that text corpora contain recoverable and accurate imprints of our historic biases, whether morally neutral as toward insects or flowers, problematic as toward race or gender, or even simply veridical, reflecting the status quo distribution of gender with respect to careers or first names. Our methods hold promise for identifying and addressing sources of bias in culture, including technology.
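The abstract describes measuring Implicit Association Test-style biases with a statistical model of word meaning. A minimal sketch of one such measure follows, assuming words are represented as vectors and association is scored by cosine similarity. The function names, the use of the sample standard deviation, and the 2-D toy vectors are all illustrative assumptions, not the authors' implementation or real embeddings.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """How much closer w is to attribute set A than to attribute set B."""
    return mean(cosine(w, a) for a in attrs_a) - mean(cosine(w, b) for b in attrs_b)


def effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Standardized difference in association between two target sets.

    Assumption: normalizes by the sample standard deviation over all targets.
    """
    sx = [association(x, attrs_a, attrs_b) for x in targets_x]
    sy = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)


# Hypothetical 2-D toy vectors standing in for real word embeddings.
flowers = [(1.0, 0.1), (0.9, 0.2)]
insects = [(0.1, 1.0), (0.2, 0.9)]
pleasant = [(1.0, 0.0), (0.8, 0.1)]
unpleasant = [(0.0, 1.0), (0.1, 0.8)]

d = effect_size(flowers, insects, pleasant, unpleasant)
print(f"effect size (flowers vs. insects, pleasant vs. unpleasant): {d:.2f}")
```

A positive value here means the first target set sits closer to the first attribute set than the second target set does. Swapping the two target sets flips the sign.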
Pages: 183-186
Page count: 4
Related papers
23 in total
[1] Barocas S., 2014, CALIF LAW REV, V104
[2] Implicit discrimination [J]. Bertrand, M; Chugh, D; Mullainathan, S. AMERICAN ECONOMIC REVIEW, 2005, 95 (02): 94-98
[3] Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination [J]. Bertrand, M; Mullainathan, S. AMERICAN ECONOMIC REVIEW, 2004, 94 (04): 991-1013
[4] Bishop C., 2006, Pattern recognition and machine learning, P423
[5] Bolukbasi T, 2016, ADV NEUR IN, V29
[6] Extracting semantic representations from word co-occurrence statistics: A computational study [J]. Bullinaria, John A.; Levy, Joseph P. BEHAVIOR RESEARCH METHODS, 2007, 39 (03): 510-526
[7] Dwork C., 2012, Proceedings of the 3rd Conference on Innovations in Theoretical Computer Science, P214, DOI 10.1145/2090236.2090255
[8] Certifying and Removing Disparate Impact [J]. Feldman, Michael; Friedler, Sorelle A.; Moeller, John; Scheidegger, Carlos; Venkatasubramanian, Suresh. KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015: 259-268
[9] Measuring individual differences in implicit cognition: The implicit association test [J]. Greenwald, AG; McGhee, DE; Schwartz, JLK. JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY, 1998, 74 (06): 1464-1480
[10] Hanheide M., 2015, ARTIF INTELL, V2015