Response to correspondence regarding "Analysis of large-language model versus human performance for genetics questions"

Cited by: 1
Authors
Duong, Dat [1]
Solomon, Benjamin D. [1]
Affiliations
[1] Natl Human Genome Res Inst, Med Genom Unit, Med Genet Branch, Bethesda, MD 20894 USA
Funding
U.S. National Institutes of Health;
DOI
10.1038/s41431-023-01444-3
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Large-language models like ChatGPT have recently received a great deal of attention. One area of interest pertains to how these models could be used in biomedical contexts, including related to human genetics. To assess one facet of this, we compared the performance of ChatGPT versus human respondents (13,642 human responses) in answering 85 multiple-choice questions about aspects of human genetics. Overall, ChatGPT did not perform significantly differently (p = 0.8327) than human respondents; ChatGPT was 68.2% accurate, compared to 66.6% accuracy for human respondents. Both ChatGPT and humans performed better on memorization-type questions versus critical thinking questions (p < 0.0001). When asked the same question multiple times, ChatGPT frequently provided different answers (16% of initial responses), including for both initially correct and incorrect answers, and gave plausible explanations for both correct and incorrect answers. ChatGPT's performance was impressive, but currently demonstrates significant shortcomings for clinical or other high-stakes use. Addressing these limitations will be important to guide adoption in real-life situations. © 2023. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.
Pages: 379-380
Page count: 2
Related papers
44 in total
  • [1] Response to correspondence regarding “Analysis of large-language model versus human performance for genetics questions”
    Dat Duong
    Benjamin D. Solomon
    European Journal of Human Genetics, 2024, 32 : 379 - 380
  • [2] Analysis of large-language model versus human performance for genetics questions
    Dat Duong
    Benjamin D. Solomon
    European Journal of Human Genetics, 2024, 32 : 466 - 468
  • [3] Performance of a Large-Language Model in scoring construction management capstone design projects
    Castelblanco, Gabriel
    Cruz-Castro, Laura
    Yang, Zhenlin
    COMPUTER APPLICATIONS IN ENGINEERING EDUCATION, 2024, 32 (06)
  • [4] Understanding Large-Language Model (LLM)-powered Human-Robot Interaction
    Kim, Callie Y.
    Lee, Christine P.
    Mutlu, Bilge
    PROCEEDINGS OF THE 2024 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI 2024, 2024 : 371 - 380
  • [5] Correspondence on “Assessing the Accuracy of Responses by the Language Model ChatGPT to Questions Regarding Bariatric Surgery”
    Namria Ishaaq
    Shahab Saquib Sohail
    Obesity Surgery, 2023, 33 : 4159 - 4159
  • [6] Correspondence on "Assessing the Accuracy of Responses by the Language Model ChatGPT to Questions Regarding Bariatric Surgery"
    Ishaaq, Namria
    Sohail, Shahab Saquib
    OBESITY SURGERY, 2023, 33 (12) : 4159 - 4159
  • [7] Comparative Analysis of Multimodal Large Language Model Performance on Clinical Vignette Questions
    Han, Tianyu
    Adams, Lisa C.
    Bressem, Keno K.
    Busch, Felix
    Nebelung, Sven
    Truhn, Daniel
    JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2024, 331 (15) : 1320 - 1321
  • [8] Effects of prompt engineering on large language model performance in response to questions on common ophthalmic conditions
    Wu, Jo-Hsuan
    Nishida, Takashi
    Moghimi, Sasan
    Weinreb, Robert N.
    TAIWAN JOURNAL OF OPHTHALMOLOGY, 2024, 14 (03) : 454 - +
  • [9] Performance of a Large Language Model on Practice Questions for the Neonatal Board Examination
    Beam, Kristyn
    Sharma, Puneet
    Kumar, Bhawesh
    Wang, Cindy
    Brodsky, Dara
    Martin, Camilia R.
    Beam, Andrew
    JAMA PEDIATRICS, 2023, 177 (09) : 977 - 979
  • [10] Performance of large language model artificial intelligence on dermatology board exam questions
    Park, Lily
    Ehlert, Brittany
    Susla, Lyudmyla
    Lum, Zachary C.
    Lee, Patrick K.
    CLINICAL AND EXPERIMENTAL DERMATOLOGY, 2023, 49 (07) : 733 - 734