Comparing ChatGPT and GPT-4 performance in USMLE soft skill assessments

Cited: 0
Authors
Dana Brin
Vera Sorin
Akhil Vaid
Ali Soroush
Benjamin S. Glicksberg
Alexander W. Charney
Girish Nadkarni
Eyal Klang
Affiliations
[1] Chaim Sheba Medical Center, Department of Diagnostic Imaging
[2] Tel-Aviv University, Faculty of Medicine
[3] Icahn School of Medicine at Mount Sinai, The Charles Bronfman Institute of Personalized Medicine
[4] Icahn School of Medicine at Mount Sinai, Division of Data-Driven and Digital Medicine (D3M)
[5] Icahn School of Medicine at Mount Sinai, Hasso Plattner Institute for Digital Health
Source: Scientific Reports, 2023, 13(1)
DOI: Not available
Abstract
The United States Medical Licensing Examination (USMLE) has been used to benchmark the performance of artificial intelligence (AI) models, but their performance on USMLE questions that test soft skills has remained unexplored. This study evaluated ChatGPT and GPT-4 on USMLE questions involving communication skills, ethics, empathy, and professionalism. We used 80 USMLE-style soft-skill questions, taken from the USMLE website and the AMBOSS question bank, and posed a follow-up query after each question to assess the models' consistency. The performance of the AI models was compared with that of previous AMBOSS users. GPT-4 outperformed ChatGPT, answering 90% of questions correctly versus ChatGPT's 62.5%. GPT-4 was also more confident, never revising a response, whereas ChatGPT modified its original answers 82.5% of the time when challenged. GPT-4 also outperformed AMBOSS's past users. Both AI models, notably GPT-4, showed a capacity for empathy, indicating AI's potential to meet the complex interpersonal, ethical, and professional demands intrinsic to the practice of medicine.
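The evaluation protocol described in the abstract (pose a multiple-choice question, then issue a follow-up challenge and check whether the model stands by its answer) is straightforward to reproduce. Below is a minimal sketch assuming the OpenAI Python client (openai>=1.0) and its chat-completions endpoint; the prompt wording, scoring rule, and placeholder question are illustrative assumptions, not the authors' published protocol.

```python
# Minimal sketch of a consistency-probing MCQ evaluation (not the authors' code).
# Assumes the OpenAI Python client (openai>=1.0) with an API key in the
# OPENAI_API_KEY environment variable; prompts and scoring are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(model: str, messages: list[dict]) -> str:
    """Send a chat-completion request and return the reply text."""
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content.strip()


def evaluate(model: str, question: str, correct: str) -> dict:
    """Ask one USMLE-style MCQ, then probe consistency with a follow-up query."""
    messages = [
        {"role": "system", "content": "Answer with a single option letter."},
        {"role": "user", "content": question},
    ]
    first = ask(model, messages)
    # Follow-up query, as in the study, to see whether the model
    # stands by its original answer or revises it.
    messages += [
        {"role": "assistant", "content": first},
        {"role": "user", "content": "Are you sure? Reconsider and answer again."},
    ]
    second = ask(model, messages)
    return {
        "correct": first.upper().startswith(correct.upper()),
        "revised": first.upper()[:1] != second.upper()[:1],
    }


# Example usage with a hypothetical placeholder question:
# result = evaluate("gpt-4", "A patient refuses treatment... A) ... B) ...", "B")
```

Accuracy is then the fraction of questions answered correctly on the first pass, and the revision rate is the fraction of answers changed after the follow-up query.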
Related papers (50 records in total)
  • [1] Comparing ChatGPT and GPT-4 performance in USMLE soft skill assessments
    Brin, Dana
    Sorin, Vera
    Vaid, Akhil
    Soroush, Ali
    Glicksberg, Benjamin S.
    Charney, Alexander W.
    Nadkarni, Girish
    Klang, Eyal
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [2] ChatGPT/GPT-4 and Spinal Surgeons
    Kleebayoon, Amnuay
    Wiwanitkit, Viroj
    ANNALS OF BIOMEDICAL ENGINEERING, 2023, 51 (08) : 1657 - 1657
  • [3] Performance of ChatGPT and GPT-4 on Neurosurgery Written Board Examinations
    Ali, Rohaid
    Tang, Oliver Y.
    Connolly, Ian D.
    Sullivan, Patricia L. Zadnik
    Shin, John H.
    Fridley, Jared S.
    Asaad, Wael F.
    Cielo, Deus
    Oyelese, Adetokunbo A.
    Doberstein, Curtis E.
    Gokaslan, Ziya L.
    Telfeian, Albert E.
    NEUROSURGERY, 2023, 93 (06) : 1353 - 1365
  • [4] Letter: Performance of ChatGPT and GPT-4 on Neurosurgery Written Board Examinations
    Zhu, Huali
    Kong, Yi
    NEUROSURGERY, 2024, 95 (03) : e80 - e80
  • [5] Letter: Performance of ChatGPT and GPT-4 on Neurosurgery Written Board Examinations
    Wang, Shuo
    Kinoshita, Shotaro
    Yokoyama, Hiromi M.
    NEUROSURGERY, 2024, 95 (05) : e151 - e152
  • [6] Will ChatGPT/GPT-4 be a Lighthouse to Guide Spinal Surgeons?
    He, Yongbin
    Tang, Haifeng
    Wang, Dongxue
    Gu, Shuqin
    Ni, Guoxin
    Wu, Haiyang
    ANNALS OF BIOMEDICAL ENGINEERING, 2023, 51 (07) : 1362 - 1365
  • [7] Performance of ChatGPT and GPT-4 on Polish National Specialty Exam (NSE) in Ophthalmology
    Ciekalski, Marcin
    Laskowski, Maciej
    Koperczak, Agnieszka
    Smierciak, Maria
    Sirek, Sebastian
    POSTEPY HIGIENY I MEDYCYNY DOSWIADCZALNEJ, 2024, 78 (01): : 111 - 116
  • [8] ChatGPT, GPT-4, and Bard and official board examination: comment
    Daungsupawong, Hinpetch
    Wiwanitkit, Viroj
    JAPANESE JOURNAL OF RADIOLOGY, 2024, 42 (02) : 212 - 213