BiasAsker: Measuring the Bias in Conversational AI System

被引:12
|
作者
Wan, Yuxuan [1 ]
Wang, Wenxuan [1 ]
He, Pinjia [2 ]
Gu, Jiazhen [1 ]
Bai, Haonan [1 ]
Lyu, Michael R. [1 ]
机构
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Shenzhen CUHK Shenzhen, Sch Data Sci, Shenzhen, Peoples R China
来源
PROCEEDINGS OF THE 31ST ACM JOINT MEETING EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, ESEC/FSE 2023 | 2023年
基金
中国国家自然科学基金;
关键词
Software testing; conversational models; social bias;
D O I
10.1145/3611643.3616310
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Powered by advanced Artificial Intelligence (AI) techniques, conversational AI systems, such as ChatGPT, and digital assistants like Siri, have been widely deployed in daily life. However, such systems may still produce content containing biases and stereotypes, causing potential social problems. Due to modern AI techniques' data-driven, black-box nature, comprehensively identifying and measuring biases in conversational systems remains challenging. Particularly, it is hard to generate inputs that can comprehensively trigger potential bias due to the lack of data containing both social groups and biased properties. In addition, modern conversational systems can produce diverse responses (e.g., chatting and explanation), which makes existing bias detection methods based solely on sentiment and toxicity hardly being adopted. In this paper, we propose BiasAsker, an automated framework to identify and measure social bias in conversational AI systems. To obtain social groups and biased properties, we construct a comprehensive social bias dataset containing a total of 841 groups and 5,021 biased properties. Given the dataset, BiasAsker automatically generates questions and adopts a novel method based on existence measurement to identify two types of biases (i.e., absolute bias and related bias) in conversational systems. Extensive experiments on eight commercial systems and two famous research models, such as ChatGPT and GPT-3, show that 32.83% of the questions generated by BiasAsker can trigger biased behaviors in these widely deployed conversational systems. All the code, data, and experimental results have been released to facilitate future research.
引用
收藏
页码:515 / 527
页数:13
相关论文
共 50 条
  • [31] Key Considerations for Incorporating Conversational AI in Psychotherapy
    Miner, Adam S.
    Shah, Nigam
    Bullock, Kim D.
    Arnow, Bruce A.
    Bailenson, Jeremy
    Hancock, Jeff
    FRONTIERS IN PSYCHIATRY, 2019, 10
  • [32] Search-Oriented Conversational AI (SCAI)
    Burtsev, Mikhail
    Chuklin, Aleksandr
    Kiseleva, Julia
    Borisov, Alexey
    ICTIR'17: PROCEEDINGS OF THE 2017 ACM SIGIR INTERNATIONAL CONFERENCE THEORY OF INFORMATION RETRIEVAL, 2017, : 333 - 334
  • [33] Managing Bias in AI
    Roselli, Drew
    Matthews, Jeanna
    Talagala, Nisha
    COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2019 ), 2019, : 539 - 544
  • [34] Biometrics and AI Bias
    Michael, Katina
    Abbas, Roba
    Jayashree, Payyazhi
    Bandara, Ruwan J.
    Aloudat, Anas
    IEEE Transactions on Technology and Society, 2022, 3 (01): : 2 - 8
  • [35] Understanding Avoidance Behaviors of Users for Conversational AI
    Wu, Siyuan
    Shu, Yatong
    Yang, Xinyue
    Huang, Zilin
    Zhang, Xuzheng
    Chen, Xiyin
    Peng, Guochao
    DISTRIBUTED, AMBIENT AND PERVASIVE INTERACTIONS, DAPI 2023, PT I, 2023, 14036 : 281 - 294
  • [36] Development of Conversational AI for Sleep Coaching Programme
    Shim, Heereen
    EACL 2021: THE 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2021, : 121 - 128
  • [37] Simplifying Network Orchestration using Conversational AI
    Panchal, Deven
    Verma, Prafulla
    Baran, Isilay
    Musgrove, Dan
    Lu, David
    38TH INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, ICOIN 2024, 2024, : 84 - 89
  • [38] Alexa Prize - State of the Art in Conversational AI
    Khatri, Chandra
    Venkatesh, Anu
    Hedayatnia, Behnam
    Ram, Ashwin
    Gabriel, Raefer
    Prasad, Rohit
    AI MAGAZINE, 2018, 39 (03) : 40 - 55
  • [39] Requirements Conflicts Detection: Advancing with Conversational AI
    Kisso, George
    Fotrousi, Farnaz
    32ND INTERNATIONAL REQUIREMENTS ENGINEERING CONFERENCE WORKSHOPS, REW 2024, 2024, : 101 - 107
  • [40] Exploring the impact of ChatGPT: conversational AI in education
    Bettayeb, Anissa M.
    Abu Talib, Manar
    Altayasinah, Al Zahraa Sobhe
    Dakalbab, Fatima
    FRONTIERS IN EDUCATION, 2024, 9