Comparing ChatGPT and GPT-4 performance in USMLE soft skill assessments

Cited by: 0

Authors:

Dana Brin

Vera Sorin

Akhil Vaid

Ali Soroush

Benjamin S. Glicksberg

Alexander W. Charney

Girish Nadkarni

Eyal Klang

Affiliations:

[1] Chaim Sheba Medical Center, Department of Diagnostic Imaging

[2] Tel-Aviv University, Faculty of Medicine

[3] Icahn School of Medicine at Mount Sinai, The Charles Bronfman Institute of Personalized Medicine

[4] Icahn School of Medicine at Mount Sinai, Division of Data-Driven and Digital Medicine (D3M)

[5] Icahn School of Medicine at Mount Sinai, Hasso Plattner Institute for Digital Health

Source:

Scientific Reports, Volume 13

Abstract:

The United States Medical Licensing Examination (USMLE) has been a subject of performance study for artificial intelligence (AI) models. However, their performance on questions involving USMLE soft skills remains unexplored. This study aimed to evaluate ChatGPT and GPT-4 on USMLE questions involving communication skills, ethics, empathy, and professionalism. We used 80 USMLE-style questions involving soft skills, taken from the USMLE website and the AMBOSS question bank. A follow-up query was used to assess the models’ consistency. The performance of the AI models was compared to that of previous AMBOSS users. GPT-4 outperformed ChatGPT, correctly answering 90% compared to ChatGPT’s 62.5%. GPT-4 showed more confidence, not revising any responses, while ChatGPT modified its original answers 82.5% of the time. The performance of GPT-4 was higher than that of AMBOSS's past users. Both AI models, notably GPT-4, showed capacity for empathy, indicating AI's potential to meet the complex interpersonal, ethical, and professional demands intrinsic to the practice of medicine.

Citations

50 items in total

[31] ChatGPT surges ahead: GPT-4 has arrived in the arena of medical research
Wang, Ying-Mei
Chen, Tzeng-Ji
JOURNAL OF THE CHINESE MEDICAL ASSOCIATION, 2023, 86 (09) : 784 - 785
[32] Improving the Readability of Generated Tests Using GPT-4 and ChatGPT Code Interpreter
Gay, Gregory
SEARCH-BASED SOFTWARE ENGINEERING, SSBSE 2023, 2024, 14415 : 140 - 146
[33] Artificial Intelligence in Intensive Care Medicine: Toward a ChatGPT/GPT-4 Way?
Yanqiu Lu
Haiyang Wu
Shaoyan Qi
Kunming Cheng
Annals of Biomedical Engineering, 2023, 51 : 1898 - 1903
[34] Comparing the performance of ChatGPT GPT-4, Bard, and Llama-2 in the Taiwan Psychiatric Licensing Examination and in differential diagnosis with multi-center psychiatrists
Li, Dian-Jeng
Kao, Yu-Chen
Tsai, Shih-Jen
Bai, Ya-Mei
Yeh, Ta-Chuan
Chu, Che-Sheng
Hsu, Chih-Wei
Cheng, Szu-Wei
Hsu, Tien-Wei
Liang, Chih-Sung
Su, Kuan-Pin
PSYCHIATRY AND CLINICAL NEUROSCIENCES, 2024, 78 (06) : 347 - 352
[35] Large language models for structured reporting in radiology: performance of GPT-4, ChatGPT-3.5, Perplexity and Bing
Carlo A. Mallio
Andrea C. Sertorio
Caterina Bernetti
Bruno Beomonte Zobel
La radiologia medica, 2023, 128 : 808 - 812
[36] The performance of ChatGPT on orthopaedic in-service training exams: A comparative study of the GPT-3.5 turbo and GPT-4 models in orthopaedic education
Rizzo, Michael G.
Cai, Nathan
Constantinescu, David
JOURNAL OF ORTHOPAEDICS, 2024, 50 : 70 - 75
[37] Comparing GPT-3.5 and GPT-4 Accuracy and Drift in Radiology Diagnosis Please Cases
Li, David
Gupta, Kartik
Bhaduri, Mousumi
Sathiadoss, Paul
Bhatnagar, Sahir
Chong, Jaron
RADIOLOGY, 2024, 310 (01)
[38] Large language models for structured reporting in radiology: performance of GPT-4, ChatGPT-3.5, Perplexity and Bing
Mallio, Carlo A.
Sertorio, Andrea C.
Bernetti, Caterina
Beomonte Zobel, Bruno
RADIOLOGIA MEDICA, 2023, 128 (07) : 808 - 812
[39] Claude 3 Opus and ChatGPT With GPT-4 in Dermoscopic Image Analysis for Melanoma Diagnosis: Comparative Performance Analysis
Liu, Xu
Duan, Chaoli
Kim, Min-kyu
Zhang, Lu
Jee, Eunjin
Maharjan, Beenu
Huang, Yuwei
Du, Dan
Jiang, Xian
JMIR MEDICAL INFORMATICS, 2024, 12
[40] Performance of Novel GPT-4 in Otolaryngology Knowledge Assessment
Revercomb, Lucy
Patel, Aman M.
Fu, Daniel
Filimonov, Andrey
INDIAN JOURNAL OF OTOLARYNGOLOGY AND HEAD & NECK SURGERY, 2024, 76 (06) : 6112 - 6114
