Large language models: a new frontier in paediatric cataract patient education

Cited by: 3
Authors
Dihan, Qais [1,2]
Chauhan, Muhammad Z. [2]
Eleiwa, Taher K. [3]
Brown, Andrew D. [4]
Hassan, Amr K. [5]
Khodeiry, Mohamed M. [6]
Elsheikh, Reem H. [2]
Oke, Isdin [7]
Nihalani, Bharti R. [7]
VanderVeen, Deborah K. [7]
Sallam, Ahmed B. [2]
Elhusseiny, Abdelrahman M. [2,7]
Institutions
[1] Rosalind Franklin Univ Med & Sci, Chicago Med Sch, N Chicago, IL USA
[2] Univ Arkansas Med Sci, Dept Ophthalmol, Little Rock, AR 72205 USA
[3] Benha Univ, Dept Ophthalmol, Banha, Egypt
[4] Univ Arkansas Med Sci, Little Rock, AR USA
[5] South Valley Univ, Dept Ophthalmol, Qena, Egypt
[6] Univ Kentucky, Dept Ophthalmol, Lexington, KY USA
[7] Harvard Med Sch, Boston Childrens Hosp, Dept Ophthalmol, Boston, MA 02115 USA
Keywords
Medical Education; Public Health; Epidemiology; Child Health (Paediatrics); Childhood; Readability; Information; Quality; Health
DOI
10.1136/bjo-2024-325252
Chinese Library Classification
R77 [Ophthalmology]
Subject classification code
100212
Abstract
Background/aims: This was a cross-sectional comparative study. We evaluated the ability of three large language models (LLMs) (ChatGPT-3.5, ChatGPT-4 and Google Bard) to generate novel patient education materials (PEMs) and to improve the readability of existing PEMs on paediatric cataract.

Methods: We compared the LLMs' responses to three prompts. Prompt A requested a handout on paediatric cataract that was 'easily understandable by an average American'. Prompt B modified prompt A and requested that the handout be written at a 'sixth-grade reading level, using the Simple Measure of Gobbledygook (SMOG) readability formula'. Prompt C rewrote existing PEMs on paediatric cataract 'to a sixth-grade reading level using the SMOG readability formula'. Responses were compared on quality (DISCERN; 1 (low quality) to 5 (high quality)), understandability and actionability (Patient Education Materials Assessment Tool; >=70%: understandable, >=70%: actionable), accuracy (Likert misinformation scale; 1 (no misinformation) to 5 (high misinformation)) and readability (SMOG and Flesch-Kincaid Grade Level (FKGL); grade level <7: highly readable).

Results: All LLM-generated responses were of high quality (median DISCERN >=4), understandability (>=70%) and accuracy (Likert=1). No LLM-generated response met the actionability threshold (<70%). ChatGPT-3.5 and ChatGPT-4 prompt B responses were more readable than their prompt A responses (p<0.001). ChatGPT-4 generated more readable responses (lower SMOG and FKGL scores; 5.59 +/- 0.5 and 4.31 +/- 0.7, respectively) than the other two LLMs (p<0.001) and consistently rewrote existing PEMs to or below the specified sixth-grade reading level (SMOG: 5.14 +/- 0.3).

Conclusion: LLMs, particularly ChatGPT-4, proved valuable for generating high-quality, readable and accurate PEMs and for improving the readability of existing materials on paediatric cataract.
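For context on the grade-level thresholds in the abstract, the published SMOG and FKGL formulas can be sketched in Python. This is a minimal illustration, not the tooling used in the study: the syllable counter is a naive vowel-group heuristic (real implementations use validated syllable counts), and SMOG is conventionally computed over a 30-sentence sample, which the formula's 30/sentences factor normalises for.

```python
import math
import re


def count_syllables(word: str) -> int:
    # Naive heuristic: one syllable per run of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def smog_grade(text: str) -> float:
    # SMOG grade = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291,
    # where polysyllables are words of three or more syllables.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


def fkgl(text: str) -> float:
    # FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)
```

Both scores map text statistics onto US school grade levels, which is why the study's sixth-grade target corresponds to scores below 7.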
Pages: 7