Digesting Digital Health: A Study of Appropriateness and Readability of ChatGPT-Generated Gastroenterological Information

Cited by: 1
Authors
Toiv, Avi [1 ]
Saleh, Zachary [2 ]
Ishak, Angela [1 ]
Alsheik, Eva [2 ]
Venkat, Deepak [2 ]
Nandi, Neilanjan [3 ]
Zuchelli, Tobias E. [2 ]
Affiliations
[1] Henry Ford Hosp, Dept Internal Med, Detroit, MI USA
[2] Henry Ford Hosp, Div Gastroenterol & Hepatol, Detroit, MI USA
[3] Univ Penn, Div Gastroenterol & Hepatol, Philadelphia, PA 19104 USA
Keywords
natural language processing; AI; artificial intelligence; medical terminology; gastroenterology; EDUCATION MATERIALS; QUALITY;
DOI
10.14309/ctg.0000000000000765
Chinese Library Classification
R57 [Digestive system and abdominal diseases]
Abstract
INTRODUCTION: The advent of artificial intelligence-powered large language models capable of generating interactive responses to intricate queries marks a groundbreaking development in how patients access medical information. Our aim was to evaluate the appropriateness and readability of gastroenterological information generated by Chat Generative Pretrained Transformer (ChatGPT).
METHODS: We analyzed responses generated by ChatGPT to 16 dialog-based queries assessing symptoms and treatments for gastrointestinal conditions and 13 definition-based queries on prevalent topics in gastroenterology. Three board-certified gastroenterologists evaluated output appropriateness with a 5-point Likert-scale proxy measurement of currency, relevance, accuracy, comprehensiveness, clarity, and urgency/next steps. Outputs with a score of 4 or 5 in all 6 categories were designated as "appropriate." Output readability was assessed with Flesch Reading Ease, Flesch-Kincaid Reading Level, and Simple Measure of Gobbledygook (SMOG) scores.
RESULTS: ChatGPT responses to 44% of the 16 dialog-based and 69% of the 13 definition-based questions were deemed appropriate, and the proportion of appropriate responses within the 2 groups of questions was not significantly different (P = 0.17). Notably, none of ChatGPT's responses to questions related to gastrointestinal emergencies were designated appropriate. The mean readability scores showed that outputs were written at a college-level reading proficiency.
DISCUSSION: ChatGPT can produce generally fitting responses to gastroenterological medical queries, but responses were constrained in appropriateness and readability, which limits the current utility of this large language model. Substantial development is essential before these models can be unequivocally endorsed as reliable sources of medical information.
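The readability metrics named in the abstract have standard published formulas, and the study's "appropriate" designation is a simple threshold rule (score of 4 or 5 in all 6 categories). The sketch below illustrates both; it uses a naive vowel-group syllable counter as an assumption, not the validated scoring tools the authors used, so its numbers are approximations:

```python
import math
import re

def count_syllables(word: str) -> int:
    """Crude approximation: count groups of consecutive vowels (min 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> dict:
    """Compute Flesch Reading Ease, Flesch-Kincaid grade, and SMOG."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # SMOG counts "polysyllabic" words: 3 or more syllables
    poly = sum(1 for w in words if count_syllables(w) >= 3)
    n_sent, n_words = len(sentences), len(words)
    return {
        "flesch_reading_ease": 206.835 - 1.015 * (n_words / n_sent)
                               - 84.6 * (syllables / n_words),
        "flesch_kincaid_grade": 0.39 * (n_words / n_sent)
                                + 11.8 * (syllables / n_words) - 15.59,
        "smog": 1.0430 * math.sqrt(poly * (30 / n_sent)) + 3.1291,
    }

def is_appropriate(likert_scores: dict) -> bool:
    """Study rule: every one of the 6 category scores must be 4 or 5."""
    return all(score >= 4 for score in likert_scores.values())
```

A Flesch Reading Ease below roughly 50 or a Flesch-Kincaid grade above 12 corresponds to the "college-level" proficiency the study reports; patient materials are typically targeted at a 6th- to 8th-grade level.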
Pages: 9
Related papers (50 total)
  • [21] AI vs. Human-Authored Headlines: Evaluating the Effectiveness, Trust, and Linguistic Features of ChatGPT-Generated Clickbait and Informative Headlines in Digital News
    Gherhes, Vasile
    Farcasiu, Marcela Alina
    Cernicova-Buca, Mariana
    Coman, Claudiu
    INFORMATION, 2025, 16 (02)
  • [22] Comparisons of Quality, Correctness, and Similarity Between ChatGPT-Generated and Human-Written Abstracts for Basic Research: Cross-Sectional Study
    Cheng, Shu-Li
    Tsai, Shih-Jen
    Bai, Ya-Mei
    Ko, Chih-Hung
    Hsu, Chih-Wei
    Yang, Fu-Chi
    Tsai, Chia-Kuang
    Tu, Yu-Kang
    Yang, Szu-Nian
    Tseng, Ping-Tao
    Hsu, Tien-Wei
    Liang, Chih-Sung
    Su, Kuan-Pin
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2023, 25
  • [23] Digital Health Literacy: Evaluating the Readability and Reliability of Cochlear Implant Patient Information on the Web
    Vishak, M. S.
    Surendran, Adwaith Krishna
    Krishnan, Nandini B.
    Raja, Kalaiarasi
    INDIAN JOURNAL OF OTOLARYNGOLOGY AND HEAD & NECK SURGERY, 2024, 76 (01) : 987 - 991
  • [25] Deep learning in digital health with ChatGPT: a study on efficient code generation
    Loh, B. C. S.
    Fong, A. Y. Y.
    Ong, T. K.
    Then, P. H. H.
    EUROPEAN HEART JOURNAL, 2023, 44
  • [26] ChatGPT and the Future of Digital Health: A Study on Healthcare Workers' Perceptions and Expectations
    Temsah, Mohamad-Hani
    Aljamaan, Fadi
    Malki, Khalid H.
    Alhasan, Khalid
    Altamimi, Ibraheem
    Aljarbou, Razan
    Bazuhair, Faisal
    Alsubaihin, Abdulmajeed
    Abdulmajeed, Naif
    Alshahrani, Fatimah S.
    Temsah, Reem
    Alshahrani, Turki
    Al-Eyadhy, Lama
    Alkhateeb, Serin Mohammed
    Saddik, Basema
    Halwani, Rabih
    Jamal, Amr
    Al-Tawfiq, Jaffar A.
    Al-Eyadhy, Ayman
    HEALTHCARE, 2023, 11 (13)
  • [27] Digital health information: case study the information kiosk
    Nicholas, D
    Williams, P
    Huntington, P
    ASLIB PROCEEDINGS, 2000, 52 (09): : 315 - 330
  • [28] ChatGPT's Ability to Assess Quality and Readability of Online Medical Information: Evidence From a Cross-Sectional Study
    Golan, Roei
    Ripps, Sarah J.
    Reddy, Raghuram
    Loloi, Justin
    Bernstein, Ari P.
    Connelly, Zachary M.
    Golan, Noa S.
    Ramasamy, Ranjith
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2023, 15 (07)
  • [29] Efficient Health Information Management Based on Patient-Generated Digital Data
    Psiha, Maria M.
    GENEDIS 2016: GERIATRICS, 2017, 989 : 271 - 280
  • [30] Re: Momenaei et al.: Appropriateness and readability of ChatGPT-4-generated responses for surgical treatment of retinal diseases (Ophthalmol Retina. 2023;7:862-868)
    Bommakanti, Nikhil
    Caranfa, Jonathan T.
    Young, Benjamin K.
    Zhao, Peter Y.
    OPHTHALMOLOGY RETINA, 2024, 8 (01): : E1 - E1