Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing

Cited by: 16

Authors:

Sumbal, Anusha [1]

Sumbal, Ramish [1]

Amir, Alina [1]

Affiliations:

[1] Dow Univ Hlth Sci, Baba E Urdu Rd, Karachi 74200, Pakistan

Source:

JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT | 2024 / Vol. 11

Keywords:

ChatGPT; academic performance; medical education; artificial intelligence; digital health; medicine

DOI:

10.1177/23821205241238641

Chinese Library Classification (CLC) number:

G40 [Education]

Discipline classification codes:

040101; 120403

Abstract:

OBJECTIVE: We aim to conduct a systematic review to assess the academic potential of ChatGPT-3.5, along with its strengths and limitations when taking medical exams.

METHOD: Following PRISMA guidelines, a systematic search of the literature was performed using the electronic databases PUBMED/MEDLINE, Google Scholar, and Cochrane. Articles from database inception until April 4, 2023, were queried. A formal narrative analysis was conducted by systematically grouping the similarities and differences between individual findings.

RESULTS: After rigorous screening, 12 articles were included in this review. All the selected papers assessed the academic performance of ChatGPT-3.5. One study compared the performance of ChatGPT-3.5 with that of ChatGPT-4 when taking a medical exam. Overall, ChatGPT performed well in 4 tests, averagely in 4 tests, and poorly in 4 tests. ChatGPT's performance was directly proportional to the level of the questions' difficulty but was unremarkable on whether the questions were binary, descriptive, or MCQ-based. ChatGPT's explanation, reasoning, memory, and accuracy were remarkably good, whereas it failed to understand image-based questions and lacked insight and critical thinking.

CONCLUSION: ChatGPT-3.5 performed satisfactorily in the exams it took as an examinee. However, further studies are needed to fully explore the potential of ChatGPT in medical education.

Pages: 12
