Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing

Cited by: 16
Authors
Sumbal, Anusha [1]
Sumbal, Ramish [1]
Amir, Alina [1]
Affiliations
[1] Dow Univ Hlth Sci, Baba E Urdu Rd, Karachi 74200, Pakistan
Keywords
ChatGPT; academic performance; medical education; artificial intelligence; digital health; medicine
DOI
10.1177/23821205241238641
Chinese Library Classification (CLC) number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
OBJECTIVE: We aim to conduct a systematic review to assess the academic potential of ChatGPT-3.5, along with its strengths and limitations when taking medical exams.
METHOD: Following PRISMA guidelines, a systematic search of the literature was performed using the electronic databases PubMed/MEDLINE, Google Scholar, and Cochrane. Articles from database inception until April 4, 2023, were queried. A formal narrative analysis was conducted by systematically grouping the similarities and differences between individual findings.
RESULTS: After rigorous screening, 12 articles were included in this review. All the selected papers assessed the academic performance of ChatGPT-3.5. One study compared the performance of ChatGPT-3.5 with that of ChatGPT-4 on a medical exam. Overall, ChatGPT performed well in 4 tests, averagely in 4 tests, and poorly in 4 tests. ChatGPT's performance was directly proportional to the level of the questions' difficulty but was not notably affected by whether the questions were binary, descriptive, or MCQ-based. ChatGPT's explanation, reasoning, memory, and accuracy were remarkably good, whereas it failed to understand image-based questions and lacked insight and critical thinking.
CONCLUSION: ChatGPT-3.5 performed satisfactorily in the exams it took as an examinee. However, future studies are needed to fully explore the potential of ChatGPT in medical education.
Pages: 12
Related papers
  • [1] This too shall pass: the performance of ChatGPT-3.5, ChatGPT-4 and New Bing in an Australian medical licensing examination
    Kleinig, Oliver
    Gao, Christina
    Bacchi, Stephen
    MEDICAL JOURNAL OF AUSTRALIA, 2023, 219 (05)
  • [2] Can ChatGPT pass a nursing exam?
    Allen, Chris
    Woodnutt, Samuel
    INTERNATIONAL JOURNAL OF NURSING STUDIES, 2023, 145
  • [3] Can ChatGPT pass the thoracic surgery exam?
    Gencer, Adem
    Aydin, Suphi
    AMERICAN JOURNAL OF THE MEDICAL SCIENCES, 2023, 366 (04): 291-295
  • [4] ChatGPT-3.5 passes Poland's medical final examination-Is it possible for ChatGPT to become a doctor in Poland?
    Suwala, Szymon
    Szulc, Paulina
    Guzowski, Cezary
    Kaminska, Barbara
    Dorobiala, Jakub
    Wojciechowska, Karolina
    Berska, Maria
    Kubicka, Olga
    Kosturkiewicz, Oliwia
    Kosztulska, Bernadetta
    Rajewska, Alicja
    Junik, Roman
    SAGE OPEN MEDICINE, 2024, 12
  • [5] Performance of ChatGPT-3.5 and ChatGPT-4o in the Japanese National Dental Examination
    Uehara, Osamu
    Morikawa, Tetsuro
    Harada, Fumiya
    Sugiyama, Nodoka
    Matsuki, Yuko
    Hiraki, Daichi
    Sakurai, Hinako
    Kado, Takashi
    Yoshida, Koki
    Murata, Yukie
    Matsuoka, Hirofumi
    Nagasawa, Toshiyuki
    Furuichi, Yasushi
    Abiko, Yoshihiro
    Miura, Hiroko
    JOURNAL OF DENTAL EDUCATION, 2024
  • [6] Comparison of ChatGPT-3.5, ChatGPT-4, and Orthopaedic Resident Performance on Orthopaedic Assessment Examinations
    Massey, Patrick A.
    Montgomery, Carver
    Zhang, Andrew S.
    JOURNAL OF THE AMERICAN ACADEMY OF ORTHOPAEDIC SURGEONS, 2023, 31 (23): 1173-1179
  • [7] Evaluating the performance of ChatGPT-3.5 and ChatGPT-4 on the Taiwan plastic surgery board examination
    Hsieh, Ching-Hua
    Hsieh, Hsiao-Yun
    Lin, Hui-Ping
    HELIYON, 2024, 10 (14)
  • [8] Assessment Study of ChatGPT-3.5's Performance on the Final Polish Medical Examination: Accuracy in Answering 980 Questions
    Siebielec, Julia
    Ordak, Michal
    Oskroba, Agata
    Dworakowska, Anna
    Bujalska-Zadrozny, Magdalena
    HEALTHCARE, 2024, 12 (16)
  • [9] Performance of ChatGPT-3.5 and ChatGPT-4 on the European Board of Urology (EBU) exams: a comparative analysis
    Schoch, Justine
    Schmelz, H.-U.
    Strauch, Angelina
    Borgmann, Hendrik
    Nestler, Tim
    WORLD JOURNAL OF UROLOGY, 2024, 42 (01)
  • [10] Performance of ChatGPT-3.5 and ChatGPT-4 in the Taiwan National Pharmacist Licensing Examination: Comparative Evaluation Study
    Wang, Ying-Mei
    Shen, Hung-Wei
    Chen, Tzeng-Ji
    Chiang, Shu-Chiung
    Lin, Ting-Guan
    JMIR MEDICAL EDUCATION, 2025, 11