Evaluation and mitigation of cognitive biases in medical language models

Cited by: 1
Authors
Schmidgall, Samuel [1]
Harris, Carl [2]
Essien, Ime [2]
Olshvang, Daniel [2]
Rahman, Tawsifur [2]
Kim, Ji Woong [3]
Ziaei, Rojin [4]
Eshraghian, Jason [5]
Abadir, Peter [6]
Chellappa, Rama [1,2]
Affiliations
[1] Johns Hopkins Univ, Dept Elect & Comp Engn, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Dept Biomed Engn, Baltimore, MD USA
[3] Johns Hopkins Univ, Dept Mech Engn, Baltimore, MD USA
[4] Univ Maryland, Dept Comp Sci, College Pk, MD USA
[5] Univ Calif Santa Cruz, Dept Elect & Comp Engn, Santa Cruz, CA USA
[6] Johns Hopkins Univ, Sch Med, Div Geriatr Med & Gerontol, Baltimore, MD USA
Source
NPJ DIGITAL MEDICINE | 2024, Vol. 7, No. 1
Funding
US National Science Foundation; US National Institutes of Health
DOI
10.1038/s41746-024-01283-6
Chinese Library Classification
R19 [Health care organization and administration (health service management)]
Abstract
Increasing interest in applying large language models (LLMs) to medicine is due in part to their impressive performance on medical exam questions. However, these exams do not capture the complexity of real patient-doctor interactions, which are shaped by factors such as patient compliance, experience, and cognitive bias. We hypothesized that LLMs would produce less accurate responses when faced with clinically biased questions than with unbiased ones. To test this, we developed the BiasMedQA dataset, which consists of 1,273 USMLE questions modified to replicate common clinically relevant cognitive biases. We assessed six LLMs on BiasMedQA and found that GPT-4 stood out for its resilience to bias, in contrast to Llama 2 70B-chat and PMC Llama 13B, which showed large drops in performance. Additionally, we introduced three bias mitigation strategies, which improved accuracy but did not fully restore it. Our findings highlight the need to improve LLMs' robustness to cognitive biases in order to enable more reliable applications of LLMs in healthcare.
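The abstract describes an evaluation protocol: USMLE questions are rewritten to embed a clinically relevant cognitive bias, and model accuracy on the biased variants is compared against the unmodified questions. The following Python sketch illustrates that loop under stated assumptions; the bias templates, the item fields (`question`, `options`, `answer`, `distractor`), and the `llm` callable are hypothetical placeholders, not the authors' released code or data schema.

```python
from typing import Callable, Dict, List, Optional

# Illustrative bias-injection templates; the exact wording used to build
# BiasMedQA is defined by the paper, so these strings are placeholders.
BIAS_TEMPLATES: Dict[str, str] = {
    "recency": "You recently saw a similar patient whose diagnosis was {distractor}.",
    "confirmation": "You are already convinced the diagnosis is {distractor}.",
    "frequency": "Most patients who present like this turn out to have {distractor}.",
}

def inject_bias(question: str, bias: str, distractor: str) -> str:
    """Append a bias-inducing sentence to a USMLE-style question."""
    return question + " " + BIAS_TEMPLATES[bias].format(distractor=distractor)

def accuracy(model: Callable[[str], str],
             items: List[dict],
             bias: Optional[str] = None) -> float:
    """Fraction of questions answered correctly, with or without bias injection.

    Each item is assumed to carry the question text, its answer options,
    the correct option letter, and an incorrect 'distractor' option.
    """
    correct = 0
    for item in items:
        prompt = item["question"]
        if bias is not None:
            prompt = inject_bias(prompt, bias, item["distractor"])
        prompt += ("\nOptions: " + "; ".join(item["options"])
                   + "\nAnswer with a single option letter.")
        prediction = model(prompt)  # placeholder for an API or local-model call
        if prediction.strip().upper().startswith(item["answer"].upper()):
            correct += 1
    return correct / len(items)

# Usage (assuming `llm` is a callable that returns the model's answer string):
# drop = accuracy(llm, dataset) - accuracy(llm, dataset, bias="recency")
```

The mitigation strategies reported in the paper could be compared within the same loop, for example by prepending mitigation text to the prompt before it is sent to the model.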
Pages: 9
Related Papers
(50 records in total)
  • [1] Benchmarking Cognitive Biases in Large Language Models as Evaluators
    Koo, Ryan
    Lee, Minhwa
    Raheja, Vipul
    Park, Jongin
    Kim, Zae Myung
    Kang, Dongyeop
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 517 - 545
  • [2] (Ir)rationality and cognitive biases in large language models
    Macmillan-Scott, Olivia
    Musolesi, Mirco
    ROYAL SOCIETY OPEN SCIENCE, 2024, 11 (06):
  • [3] Biases Mitigation and Expressiveness Preservation in Language Models: A Comprehensive Pipeline (Student Abstract)
    Yu, Liu
    Guo, Ludie
    Kuang, Ping
    Zhou, Fan
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 21, 2024, : 23701 - 23702
  • [4] Capturing Failures of Large Language Models via Human Cognitive Biases
    Jones, Erik
    Steinhardt, Jacob
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [5] Robust Evaluation Measures for Evaluating Social Biases in Masked Language Models
    Liu, Yang
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 18707 - 18715
  • [6] Mind the Biases: Quantifying Cognitive Biases in Language Model Prompting
    Lin, Ruixi
    Ng, Hwee Tou
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 5269 - 5281
  • [7] CHBias: Bias Evaluation and Mitigation of Chinese Conversational Language Models
    Zhao, Jiaxu
    Fang, Meng
    Shi, Zijing
    Li, Yitong
    Chen, Ling
    Pechenizkiy, Mykola
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 13538 - 13556
  • [8] Evaluation of Cognitive Architectures Inspired by Cognitive Biases
    Doell, Christoph
    Siebert, Sophie
    7TH ANNUAL INTERNATIONAL CONFERENCE ON BIOLOGICALLY INSPIRED COGNITIVE ARCHITECTURES, (BICA 2016), 2016, 88 : 155 - 162
  • [9] Likelihood-based Mitigation of Evaluation Bias in Large Language Models
    Ohi, Masanari
    Kaneko, Masahiro
    Koike, Ryuto
    Loem, Mengsay
    Okazaki, Naoaki
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 3237 - 3245
  • [10] The Role of Cognitive Biases in Criminal Intelligence Analysis and Approaches for their Mitigation
    Hillemann, Eva-Catherine
    Nussbaumer, Alexander
    Albert, Dietrich
    2015 European Intelligence and Security Informatics Conference (EISIC), 2015, : 125 - 128