Comparing ChatGPT and GPT-4 performance in USMLE soft skill assessments

Cited by: 0

Authors:

Dana Brin

Vera Sorin

Akhil Vaid

Ali Soroush

Benjamin S. Glicksberg

Alexander W. Charney

Girish Nadkarni

Eyal Klang

Affiliations:

[1] Chaim Sheba Medical Center, Department of Diagnostic Imaging

[2] Tel-Aviv University, Faculty of Medicine

[3] Icahn School of Medicine at Mount Sinai, The Charles Bronfman Institute of Personalized Medicine

[4] Icahn School of Medicine at Mount Sinai, Division of Data-Driven and Digital Medicine (D3M)

[5] Icahn School of Medicine at Mount Sinai, Hasso Plattner Institute for Digital Health

Source:

Scientific Reports, vol. 13

Abstract:

The United States Medical Licensing Examination (USMLE) has been a subject of performance study for artificial intelligence (AI) models. However, their performance on questions involving USMLE soft skills remains unexplored. This study aimed to evaluate ChatGPT and GPT-4 on USMLE questions involving communication skills, ethics, empathy, and professionalism. We used 80 USMLE-style questions involving soft skills, taken from the USMLE website and the AMBOSS question bank. A follow-up query was used to assess the models’ consistency. The performance of the AI models was compared to that of previous AMBOSS users. GPT-4 outperformed ChatGPT, correctly answering 90% compared to ChatGPT’s 62.5%. GPT-4 showed more confidence, not revising any responses, while ChatGPT modified its original answers 82.5% of the time. The performance of GPT-4 was higher than that of AMBOSS's past users. Both AI models, notably GPT-4, showed capacity for empathy, indicating AI's potential to meet the complex interpersonal, ethical, and professional demands intrinsic to the practice of medicine.
