Performance of ChatGPT in the In-Training Examination for Anesthesiology and Pain Medicine Residents in South Korea: Observational Study

Cited: 0
Authors
Yoon, Soo-Hyuk [1 ]
Oh, Seok Kyeong [2 ]
Lim, Byung Gun [2 ]
Lee, Ho-Jin [1 ]
Affiliations
[1] Seoul Natl Univ, Coll Med, Seoul Natl Univ Hosp, Dept Anesthesiol & Pain Med, Daehak Ro 101, Seoul 03080, South Korea
[2] Korea Univ, Guro Hosp, Coll Med, Dept Anesthesiol & Pain Med, Seoul, South Korea
Source
JMIR MEDICAL EDUCATION | 2024, Vol. 10
Keywords
AI tools; problem solving; anesthesiology; artificial intelligence; pain medicine; ChatGPT; health care; medical education; South Korea; BOARD;
DOI
10.2196/56859
Chinese Library Classification
G40 [Education];
Subject Classification Codes
040101; 120403;
Abstract
Background: ChatGPT has been tested in health care, including on the US Medical Licensing Examination and specialty examinations, showing near-passing results. Its performance in the field of anesthesiology has been assessed using English board examination questions; however, its effectiveness in Korea remains unexplored.

Objective: This study investigated the problem-solving performance of ChatGPT in the fields of anesthesiology and pain medicine in the Korean language context, highlighted advancements in artificial intelligence (AI), and explored its potential applications in medical education.

Methods: We investigated the performance (number of correct answers/number of questions) of GPT-4, GPT-3.5, and CLOVA X in the fields of anesthesiology and pain medicine, using the in-training examinations administered to Korean anesthesiology residents over the past 5 years (100 questions per year). Questions containing images, diagrams, or photographs were excluded from the analysis. Furthermore, to assess performance differences across languages, we conducted a comparative analysis of GPT-4's problem-solving proficiency using both the original Korean texts and their English translations.

Results: A total of 398 questions were analyzed. GPT-4 (67.8%) demonstrated significantly better overall performance than GPT-3.5 (37.2%) and CLOVA X (36.7%), whereas GPT-3.5 and CLOVA X did not differ significantly from each other. Additionally, GPT-4 performed better on the questions translated into English than on the Korean originals, indicating a language processing discrepancy (English: 75.4% vs Korean: 67.8%; difference 7.5%; 95% CI 3.1%-11.9%; P=.001).

Conclusions: This study underscores the potential of AI tools, such as ChatGPT, in medical education and practice but emphasizes the need for cautious application and further refinement, especially in non-English medical contexts. The findings suggest that although AI advancements are promising, they require careful evaluation and development to ensure acceptable performance across diverse linguistic and professional settings.
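The language comparison in the Results is a paired design: the same 398 questions were answered in Korean and in English, so the reported difference, confidence interval, and P value correspond to the kind of analysis handled by McNemar's test with a Wald interval for paired proportions; the same routine applies to pairwise model comparisons on a shared question set. The Python sketch below is illustrative only, not the authors' analysis code, and the per-question correctness vectors it takes are hypothetical stand-ins, since this record reports only aggregate figures.

import math
from scipy.stats import binomtest  # exact binomial test on the discordant pairs

def paired_proportion_comparison(correct_a, correct_b, z=1.96):
    # correct_a, correct_b: equal-length boolean sequences, one entry per
    # question (e.g., GPT-4 on the English vs Korean version of each item).
    n = len(correct_a)
    # Only discordant pairs carry information in a paired comparison.
    b = sum(1 for a, k in zip(correct_a, correct_b) if a and not k)  # A-only correct
    c = sum(1 for a, k in zip(correct_a, correct_b) if k and not a)  # B-only correct
    diff = (b - c) / n  # difference in proportions correct
    # Wald standard error for a difference in paired proportions
    se = math.sqrt((b + c) / n - diff ** 2) / math.sqrt(n)
    # Exact McNemar test: under the null, discordant pairs split 50/50.
    p = binomtest(b, b + c, 0.5).pvalue if (b + c) > 0 else 1.0
    return diff, (diff - z * se, diff + z * se), p

# Toy call with made-up vectors (not study data):
english = [True, True, False, True]
korean = [True, False, False, False]
print(paired_proportion_comparison(english, korean))

Reproducing the paper's exact interval would additionally require the per-item discordance counts, which this record does not provide.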
Pages: 10
Related Articles
50 records in total
  • [31] Do United States Medical Licensing Examination (USMLE) Scores Predict In-Training Test Performance for Emergency Medicine Residents?
    Thundiyil, Josef G.
    Modica, Renee F.
    Silvestri, Salvatore
    Papa, Linda
    JOURNAL OF EMERGENCY MEDICINE, 2010, 38 (01) : 65 - 69
  • [32] The effect of a toxicology standardized curriculum on toxicology section In-Training Examination scores of emergency medicine residents
    Boyd, Molly
    CLINICAL TOXICOLOGY, 2017, 55 (07) : 809 - 809
  • [33] Musculoskeletal Knowledge on the in-Training Examination Improves in Family Medicine Residents Participating in a Longitudinal Sports Medicine Clinical Track
    Furr, Micah
    Tumin, Dmitry
    Ferderber, Megan L.
    JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT, 2024, 11
  • [34] Factors Predictive of Orthopaedic In-training Examination Performance and Research Productivity Among Orthopaedic Residents
    Kreitz, Tyler
    Verma, Satyendra
    Adan, Alexei
    Verma, Kushagra
    JOURNAL OF THE AMERICAN ACADEMY OF ORTHOPAEDIC SURGEONS, 2019, 27 (06) : E286 - E292
  • [35] Performance of US and international medical graduates on the 1995 Internal Medicine In-Training Examination
    Waxman, HS
    Garibaldi, RA
    Subhiyah, RG
    ANNALS OF INTERNAL MEDICINE, 1996, 125 (02) : 158 - 158
  • [36] Validation of the General Medicine in-Training Examination Using the Professional and Linguistic Assessments Board Examination Among Postgraduate Residents in Japan
    Nagasaki, Kazuya
    Nishizaki, Yuji
    Nojima, Masanori
    Shimizu, Taro
    Konishi, Ryota
    Okubo, Tomoya
    Yamamoto, Yu
    Morishima, Ryo
    Kobayashi, Hiroyuki
    Tokuda, Yasuharu
    INTERNATIONAL JOURNAL OF GENERAL MEDICINE, 2021, 14 : 6487 - 6495
  • [37] Impact of Question Bank Use for In-Training Examination Preparation by OBGYN Residents - A Multicenter Study
    Green, Isabel
    Weaver, Amy
    Kircher, Samantha
    Levy, Gary
    Brady, Robert Michael
    Flicker, Amanda B.
    Gala, Rajiv B.
    Peterson, Joseph
    Decesare, Julie
    Breitkopf, Daniel
    JOURNAL OF SURGICAL EDUCATION, 2022, 79 (03) : 775 - 782
  • [38] Reading Habits of General Surgery Residents and Association With American Board of Surgery In-Training Examination Performance
    Kim, Jerry J.
    Kim, Dennis Y.
    Kaji, Amy H.
    Gifford, Edward D.
    Reid, Christopher
    Sidwell, Richard A.
    Reeves, Mark E.
    Hartranft, Thomas H.
    Inaba, Kenji
    Jarman, Benjamin T.
    Are, Chandrakanth
    Galante, Joseph M.
    Amersi, Farin
    Smith, Brian R.
    Melcher, Marc L.
    Nelson, Timothy
    Donahue, Timothy
    Jacobsen, Garth
    Arnell, Tracey D.
    de Virgilio, Christian
    JAMA SURGERY, 2015, 150 (09) : 882 - 889
  • [39] Does Correlation of Faculty Assessment of Emergency Medicine Residents' Medical Knowledge Competency With Performance on the In-Training Examination Improve With Advancement Through the Program?
    Barlas, D.
    Ryan, J. G.
    ANNALS OF EMERGENCY MEDICINE, 2009, 54 (03) : S33 - S33
  • [40] Does the Preferred Study Source Impact Orthopedic In-Training Examination Performance?
    Theismann, Jeffrey J.
    Solberg, Erik J.
    Agel, Julie
    Dyer, George S.
    Egol, Kenneth A.
    Israelite, Craig L.
    Karam, Matthew D.
    Kim, Hubert
    Klein, Sandra E.
    Kweon, Christopher Y.
    LaPorte, Dawn M.
    Van Heest, Ann
    JOURNAL OF SURGICAL EDUCATION, 2022, 79 (01) : 266 - 273