ChatGPT versus expert feedback on clinical reasoning questions and their effect on learning: a randomized controlled trial

Cited by: 2
Authors
Cicek, Feray Ekin [1 ]
Ulker, Muserref [1 ]
Ozer, Menekse [1 ]
Kiyak, Yavuz Selim [2 ]
Affiliations
[1] Gazi Univ, Fac Med, TR-06500 Ankara, Turkiye
[2] Gazi Univ, Fac Med, Dept Med Educ & Informat, TR-06500 Ankara, Turkiye
Keywords
ChatGPT; large language models; artificial intelligence; feedback; clinical reasoning
DOI
10.1093/postmj/qgae170
Chinese Library Classification
R5 [Internal Medicine]
Discipline Codes
1002; 100201
Abstract
Purpose: To evaluate the effectiveness of ChatGPT-generated feedback compared to expert-written feedback in improving clinical reasoning skills among first-year medical students.

Methods: This randomized controlled trial was conducted at a single medical school and involved 129 first-year medical students who were randomly assigned to two groups. Both groups completed three formative tests with feedback on urinary tract infections (UTIs; uncomplicated, complicated, pyelonephritis) over five consecutive days as spaced repetition, receiving either expert-written feedback (control, n = 65) or ChatGPT-generated feedback (experiment, n = 64). Clinical reasoning skills were assessed using Key-Features Questions (KFQs) immediately after the intervention and 10 days later. Students' critical approach to artificial intelligence (AI) was also measured before and after disclosing the AI involvement in feedback generation.

Results: There was no significant difference between the mean scores of the control group (immediate: 78.5 ± 20.6, delayed: 78.0 ± 21.2) and the experiment group (immediate: 74.7 ± 15.1, delayed: 76.0 ± 14.5) in overall performance on KFQs (out of 120 points), either immediately (P = .26) or after 10 days (P = .57), with small effect sizes. However, the control group outperformed the ChatGPT group in complicated UTI cases (P < .001). The experiment group showed a significantly more critical approach to AI after the disclosure, with medium-to-large effect sizes.

Conclusions: ChatGPT-generated feedback can be an effective alternative to expert feedback for improving clinical reasoning skills in medical students, particularly in resource-constrained settings with limited expert availability. However, AI-generated feedback may lack the nuance needed for more complex cases, underscoring the need for expert review.
Additionally, exposure to the drawbacks of AI-generated feedback can enhance students' critical approach towards AI-generated educational content.

Key Messages

What is already known on this topic: Text-based virtual patients with feedback have been shown to improve clinical reasoning, and recent advances in generative artificial intelligence (AI), such as ChatGPT, have opened new ways to provide feedback in medical education. However, the effect of AI-generated feedback had not been compared with that of expert-written feedback.

What this study adds: While the effect of ChatGPT feedback was generally on par with that of expert feedback, the study identified limitations in AI-generated explanations for more nuanced diagnosis and treatment.

How this study might affect research, practice, or policy: The findings suggest that ChatGPT can be utilized as a supplementary tool, especially in resource-limited settings where expert feedback is not readily available. Its integration could streamline feedback and improve educational efficiency, but a hybrid approach, with educators reviewing AI-generated feedback, is recommended to ensure accuracy.
Pages: 6