Performance of ChatGPT and GPT-4 on Polish National Specialty Exam (NSE) in Ophthalmology

Cited: 0
Authors
Ciekalski, Marcin [1 ]
Laskowski, Maciej [1 ]
Koperczak, Agnieszka [1 ]
Smierciak, Maria [1 ]
Sirek, Sebastian [2 ]
Affiliations
[1] Med Univ Silesia, Fac Med Sci Katowice, Student Sci Soc, Dept Ophthalmol, Katowice, Poland
[2] Med Univ Silesia, Fac Med Sci Katowice, Dept Ophthalmol, Katowice, Poland
Keywords
ophthalmology; ChatGPT; Polish national specialty exam
DOI
10.2478/ahem-2024-0006
Chinese Library Classification
R-3 [Medical Research Methods]; R3 [Basic Medicine]
Discipline Code
1001
Abstract
Introduction: Artificial intelligence (AI) has evolved significantly, driven by advancements in computing power and big data. Technologies such as machine learning and deep learning have led to sophisticated models such as GPT-3.5 and GPT-4. This study assesses the performance of these AI models on the Polish National Specialty Exam (NSE) in ophthalmology, exploring their potential to support research, education, and clinical decision-making in healthcare.

Materials and Methods: The study analyzed 98 questions from the Spring 2023 Polish NSE in Ophthalmology. Questions were categorized into five groups: Physiology & Diagnostics, Clinical & Case Questions, Treatment & Pharmacology, Surgery, and Pediatrics. GPT-3.5 and GPT-4 were tested for accuracy in answering these questions, with a confidence rating from 1 to 5 assigned to each response. Statistical analyses, including the Chi-squared test and the Mann-Whitney U test, were used to compare the models' performance.

Results: GPT-4 demonstrated a significant improvement over GPT-3.5, correctly answering 63.3% of questions compared with GPT-3.5's 37.8%. GPT-4's performance met the passing criteria for the NSE. The models showed varying accuracy across categories, with a notable gap in fields such as surgery and pediatrics.

Conclusions: The study highlights the potential of GPT models to aid clinical decisions and education in ophthalmology. However, it also underscores the models' limitations, particularly in specialized fields such as surgery and pediatrics. The findings suggest that while AI models like GPT-3.5 and GPT-4 can significantly assist in the medical field, they require further development and fine-tuning to address specific challenges across medical domains.
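The accuracy comparison reported above can be illustrated with a 2x2 Chi-squared test of independence. The sketch below uses only the Python standard library and is not the authors' analysis code; the raw counts (62/98 correct for GPT-4, 37/98 for GPT-3.5) are an assumption derived from the rounded percentages 63.3% and 37.8%.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and
    p-value for a 2x2 contingency table [[a, b], [c, d]].
    With 1 degree of freedom, p = erfc(sqrt(chi2 / 2))."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts inferred from the abstract: 98 questions per model,
# GPT-4 ~62 correct (63.3%), GPT-3.5 ~37 correct (37.8%).
chi2, p = chi2_2x2(62, 98 - 62, 37, 98 - 37)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # the gap is significant at p < 0.05
```

With these assumed counts the statistic is well above the 3.84 critical value for one degree of freedom, consistent with the abstract's claim of a significant improvement.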
Pages: 111 - 116
Page count: 6
Related Papers
50 records total
  • [21] Harnessing ChatGPT and GPT-4 for evaluating the rheumatology questions of the Spanish access exam to specialized medical training
    Alfredo Madrid-García
    Zulema Rosales-Rosado
    Dalifer Freites-Nuñez
    Inés Pérez-Sancristóbal
    Esperanza Pato-Cour
    Chamaida Plasencia-Rodríguez
    Luis Cabeza-Osorio
    Lydia Abasolo-Alcázar
    Leticia León-Mateos
    Benjamín Fernández-Gutiérrez
    Luis Rodríguez-Rodríguez
    Scientific Reports, 13 (1)
  • [22] Will ChatGPT/GPT-4 be a Lighthouse to Guide Spinal Surgeons?
    Yongbin He
    Haifeng Tang
    Dongxue Wang
    Shuqin Gu
    Guoxin Ni
    Haiyang Wu
    Annals of Biomedical Engineering, 2023, 51 : 1362 - 1365
  • [23] Comparison of GPT-3.5, GPT-4, and human user performance on a practice ophthalmology written examination
    Lin, John C.
    Younessi, David N.
    Kurapati, Sai S.
    Tang, Oliver Y.
    Scott, Ingrid U.
    EYE, 2023, 37 (17) : 3694 - 3695
  • [26] An exploratory assessment of GPT-4o and GPT-4 performance on the Japanese National Dental Examination
    Morishita, Masaki
    Fukuda, Hikaru
    Yamaguchi, Shino
    Muraoka, Kosuke
    Nakamura, Taiji
    Hayashi, Masanari
    Yoshioka, Izumi
    Ono, Kentaro
    Awano, Shuji
    SAUDI DENTAL JOURNAL, 2024, 36 (12) : 1577 - 1581
  • [27] ChatGPT, GPT-4, and Bard and official board examination: comment
    Daungsupawong, Hinpetch
    Wiwanitkit, Viroj
    JAPANESE JOURNAL OF RADIOLOGY, 2024, 42 (02) : 212 - 213
  • [29] Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4
    Chang, Kent K.
    Cramer, Mackenzie
    Soni, Sandeep
    Bamman, David
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 7312 - 7327
  • [30] ChatGPT/GPT-4: enabling a new era of surgical oncology
    Cheng, Kunming
    Wu, Haiyang
    Li, Cheng
    INTERNATIONAL JOURNAL OF SURGERY, 2023, 109 (08) : 2549 - 2550