Comparing ChatGPT and GPT-4 performance in USMLE soft skill assessments

Cited by: 0

Authors
Dana Brin
Vera Sorin
Akhil Vaid
Ali Soroush
Benjamin S. Glicksberg
Alexander W. Charney
Girish Nadkarni
Eyal Klang
Affiliations
[1] Chaim Sheba Medical Center, Department of Diagnostic Imaging
[2] Tel-Aviv University, Faculty of Medicine
[3] Icahn School of Medicine at Mount Sinai, The Charles Bronfman Institute of Personalized Medicine
[4] Icahn School of Medicine at Mount Sinai, Division of Data-Driven and Digital Medicine (D3M)
[5] Icahn School of Medicine at Mount Sinai, Hasso Plattner Institute for Digital Health
Abstract
The United States Medical Licensing Examination (USMLE) has been used to study the performance of artificial intelligence (AI) models, but their performance on USMLE questions testing soft skills remains unexplored. This study evaluated ChatGPT and GPT-4 on USMLE questions involving communication skills, ethics, empathy, and professionalism. We used 80 USMLE-style soft-skill questions taken from the USMLE website and the AMBOSS question bank, and a follow-up query was used to assess each model's consistency. The models' performance was also compared with that of previous AMBOSS users. GPT-4 outperformed ChatGPT, answering 90% of the questions correctly versus ChatGPT's 62.5%. GPT-4 was also more confident: it revised none of its responses, whereas ChatGPT modified its original answers 82.5% of the time. GPT-4 likewise outperformed previous AMBOSS users. Both models, and notably GPT-4, showed a capacity for empathy, indicating AI's potential to meet the complex interpersonal, ethical, and professional demands intrinsic to the practice of medicine.
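For readers who want to reproduce this style of evaluation, the sketch below shows one way to pose a multiple-choice question and then issue a follow-up consistency query through the OpenAI Python SDK. It is a hypothetical illustration under stated assumptions, not the authors' code: the model names, prompt wording, and scoring helper (answer_with_followup, score) are ours, and the 80-question bank is not reproduced here.

```python
# Minimal sketch of an "answer, then reconsider" evaluation loop.
# Assumes the OpenAI Python SDK (v1) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def answer_with_followup(model: str, question: str) -> tuple[str, str]:
    """Ask a multiple-choice question, then challenge the model once
    and record whether it revises its answer (the consistency check)."""
    history = [{"role": "user",
                "content": question + "\nReply with the letter of the best answer."}]
    first = client.chat.completions.create(model=model, messages=history)
    initial = first.choices[0].message.content.strip()

    history.append({"role": "assistant", "content": initial})
    history.append({"role": "user",
                    "content": "Please reconsider and state your final answer as a single letter."})
    second = client.chat.completions.create(model=model, messages=history)
    final = second.choices[0].message.content.strip()
    return initial, final

def score(model: str, bank: list[tuple[str, str]]) -> tuple[float, float]:
    """Return (accuracy, revision_rate) over (question, answer_key) pairs."""
    correct = revised = 0
    for question, key in bank:
        initial, final = answer_with_followup(model, question)
        correct += final.startswith(key)       # final answer matches the key
        revised += initial[:1] != final[:1]    # model changed its answer
    return correct / len(bank), revised / len(bank)

# Hypothetical usage, e.g. comparing "gpt-4" against "gpt-3.5-turbo" (ChatGPT):
# accuracy, revision_rate = score("gpt-4", question_bank)
```

Under this design, accuracy maps to the percentage-correct figures reported above, and revision_rate maps to how often a model changed its original answer after the follow-up query.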