Evaluation of Responses to Questions About Keratoconus Using ChatGPT-4.0, Google Gemini and Microsoft Copilot: A Comparative Study of Large Language Models on Keratoconus

Cited by: 0
Authors
Demir, Suleyman [1 ]
Affiliations
[1] Adana 5 Ocak State Hosp, Dept Ophthalmol, Adana, Turkiye
Source
Keywords
ChatGPT-4.0; Google Gemini; Microsoft Copilot; Artificial intelligence; Keratoconus; INFLAMMATORY MOLECULES; CORNEAL;
DOI
10.1097/ICL.0000000000001158
Chinese Library Classification (CLC)
R77 [Ophthalmology]
Discipline Code
100212
Abstract
Objectives: Large language models (LLMs) are increasingly used in clinical contexts, making their ability to provide accurate information to patients and physicians ever more important. This study aimed to evaluate the effectiveness of generative pre-trained transformer 4.0 (ChatGPT-4.0), Google Gemini, and Microsoft Copilot in responding to patient questions regarding keratoconus. Methods: The LLMs' responses to the 25 questions about keratoconus most commonly asked by real-life patients were blindly rated by two ophthalmologists using a 5-point Likert scale. In addition, the DISCERN scale was used to evaluate the reliability of the responses, and the Flesch Reading Ease and Flesch-Kincaid Grade Level indices were used to assess readability. Results: ChatGPT-4.0 provided more detailed and accurate answers to patients' questions about keratoconus than Google Gemini and Microsoft Copilot, with 92% of its answers rated "agree" or "strongly agree." Likert scores differed significantly among all three LLMs (P<0.001). Conclusions: Although ChatGPT-4.0's answers to questions about keratoconus were more complex for patients than those of the other language models, the information it provided was reliable and accurate.
Pages: e107-e111
Page count: 5