Digesting Digital Health: A Study of Appropriateness and Readability of ChatGPT-Generated Gastroenterological Information

Cited by: 1
Authors
Toiv, Avi [1 ]
Saleh, Zachary [2 ]
Ishak, Angela [1 ]
Alsheik, Eva [2 ]
Venkat, Deepak [2 ]
Nandi, Neilanjan [3 ]
Zuchelli, Tobias E. [2 ]
Affiliations
[1] Henry Ford Hosp, Dept Internal Med, Detroit, MI USA
[2] Henry Ford Hosp, Div Gastroenterol & Hepatol, Detroit, MI USA
[3] Univ Penn, Div Gastroenterol & Hepatol, Philadelphia, PA 19104 USA
Keywords
natural language processing; AI; artificial intelligence; medical terminology; gastroenterology; EDUCATION MATERIALS; QUALITY;
DOI
10.14309/ctg.0000000000000765
Chinese Library Classification
R57 [Diseases of the digestive system and abdomen];
Abstract
INTRODUCTION: The advent of artificial intelligence-powered large language models capable of generating interactive responses to intricate queries marks a groundbreaking development in how patients access medical information. Our aim was to evaluate the appropriateness and readability of gastroenterological information generated by Chat Generative Pretrained Transformer (ChatGPT).

METHODS: We analyzed responses generated by ChatGPT to 16 dialog-based queries assessing symptoms and treatments for gastrointestinal conditions and 13 definition-based queries on prevalent topics in gastroenterology. Three board-certified gastroenterologists evaluated output appropriateness with a 5-point Likert-scale proxy measurement of currency, relevance, accuracy, comprehensiveness, clarity, and urgency/next steps. Outputs with a score of 4 or 5 in all 6 categories were designated as "appropriate." Output readability was assessed with the Flesch Reading Ease score, Flesch-Kincaid Grade Level, and Simple Measure of Gobbledygook (SMOG) scores.

RESULTS: ChatGPT responses to 44% of the 16 dialog-based and 69% of the 13 definition-based questions were deemed appropriate, and the proportion of appropriate responses within the 2 groups of questions was not significantly different (P = 0.17). Notably, none of ChatGPT's responses to questions related to gastrointestinal emergencies were designated appropriate. The mean readability scores showed that outputs were written at a college-level reading proficiency.

DISCUSSION: ChatGPT can produce generally fitting responses to gastroenterological medical queries, but the responses were limited in appropriateness and readability, which constrains the current utility of this large language model. Substantial development is essential before these models can be unequivocally endorsed as reliable sources of medical information.
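The three readability metrics named in METHODS are simple functions of sentence, word, and syllable counts. The sketch below shows the standard published formulas with a rough heuristic syllable counter; the study does not specify its exact tooling, so this is an illustration of the metrics, not the authors' pipeline.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if count > 1 and word.endswith("e") and not word.endswith(("le", "ee")):
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Compute Flesch Reading Ease, Flesch-Kincaid Grade Level, and SMOG."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    n_sent, n_words, n_syll = len(sentences), len(words), sum(syllables)
    polysyllables = sum(1 for s in syllables if s >= 3)  # words with 3+ syllables
    return {
        # Higher = easier; 60-70 is "plain English", below 50 reads at college level.
        "flesch_reading_ease": 206.835 - 1.015 * n_words / n_sent
                               - 84.6 * n_syll / n_words,
        # Approximate U.S. school grade needed to understand the text.
        "flesch_kincaid_grade": 0.39 * n_words / n_sent
                                + 11.8 * n_syll / n_words - 15.59,
        # SMOG grade, driven entirely by polysyllabic word density
        # (strictly defined for samples of 30+ sentences).
        "smog": 1.0430 * math.sqrt(polysyllables * 30 / n_sent) + 3.1291,
    }
```

On these scales, the "college-level reading proficiency" reported in RESULTS corresponds roughly to a Flesch Reading Ease below 50 and a Flesch-Kincaid grade of 13 or higher.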
Pages: 9
Related Papers (50 total)
  • [31] Evaluation of the accuracy and readability of ChatGPT-4 and Google Gemini in providing information on retinal detachment: a multicenter expert comparative study
    Strzalkowski, Piotr
    Strzalkowska, Alicja
    Chhablani, Jay
    Pfau, Kristina
    Errera, Marie-Helene
    Roth, Mathias
    Schaub, Friederike
    Bechrakis, Nikolaos E.
    Hoerauf, Hans
    Reiter, Constantin
    Schuster, Alexander K.
    Geerling, Gerd
    Guthoff, Rainer
    INTERNATIONAL JOURNAL OF RETINA AND VITREOUS, 2024, 10 (01)
  • [32] OpenAI ChatGPT generated content and similarity index: A study of selected terms from the library & information science
    Patra, Swapan Kumar
    Kirtania, Deep Kumar
    ANNALS OF LIBRARY AND INFORMATION STUDIES, 2023, 70 (02) : 99 - 101
  • [33] Searching intention and information outcome: A case study of digital health information
    Nicholas, D
    Huntington, P
    Williams, P
LIBRI, 2001, 51 (03): 157-166
  • [34] Assessment of the Quality and Readability of Web-Based Arabic Health Information on Halitosis: Infodemiological Study
    Aboalshamat, Khalid
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2024, 26
  • [35] Mental Health Information in the Digital Environment: A Case Study in Argentina
    Zunino, Esteban Andres
    PALABRA CLAVE, 2024, 27 (04)
  • [37] Quality and readability of web-based Arabic health information on COVID-19: an infodemiological study
    Halboub, Esam
    Al-Ak'hali, Mohammed Sultan
    Al-Mekhlafi, Hesham M.
    Alhajj, Mohammed Nasser
    BMC PUBLIC HEALTH, 2021, 21 (01)
  • [38] Assessing Readability of Patient Education Materials: A Comparative Study of ASRS Resources and AI-Generated Content by Popular Large Language Models (ChatGPT 4.0 and Google Bard)
    Shi, Michael
    Hanna, Jovana
    Clavell, Christine
    Eid, Kevin
    Eid, Alen
    Ghorayeb, Ghassan
    Nguyen, John
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2024, 65 (07)
  • [39] End-of-life Care Patient Information Leaflets-A Comparative Evaluation of Artificial Intelligence-generated Content for Readability, Sentiment, Accuracy, Completeness, and Suitability: ChatGPT vs Google Gemini
    Gondode, Prakash G.
    Khanna, Puneet
    Sharma, Pradeep
    Duggal, Sakshi
    Garg, Neha
    INDIAN JOURNAL OF CRITICAL CARE MEDICINE, 2024, 28 (06) : 561 - 568
  • [40] Assessing the readability and patient comprehension of rheumatology medicine information sheets: a cross-sectional Health Literacy Study
    Oliffe, Michael
    Thompson, Emma
    Johnston, Jenny
    Freeman, Dianne
    Bagga, Hanish
    Wong, Peter K. K.
    BMJ OPEN, 2019, 9 (02):