Evaluating ChatGPT-3.5 and ChatGPT-4.0 Responses on Hyperlipidemia for Patient Education

Cited by: 10
Authors
Lee, Thomas J. [1 ]
Rao, Abhinav K. [2 ]
Campbell, Daniel J. [3 ]
Radfar, Navid [1 ]
Dayal, Manik [1 ]
Khrais, Ayham [1 ]
Affiliations
[1] Rutgers Univ New Jersey, Med Sch, Dept Med, Newark, NJ 07103 USA
[2] Trident Med Ctr, Dept Med, Charleston, SC USA
[3] Thomas Jefferson Univ Hosp, Dept Otolaryngol Head & Neck Surg, Philadelphia, PA USA
Keywords
arrhythmia; patient education; chatgpt; atrial fibrillation; artificial intelligence;
DOI
10.7759/cureus.61067
Chinese Library Classification (CLC)
R5 [Internal Medicine]
Discipline codes
1002; 100201
Abstract
Introduction: Hyperlipidemia is prevalent worldwide and affects a significant number of US adults. It is a major contributor to ischemic heart disease and millions of deaths annually. With the increasing use of the internet for health information, tools like ChatGPT (OpenAI, San Francisco, CA, USA) have gained traction. ChatGPT version 4.0, launched in March 2023, offers enhanced features over its predecessor but requires a monthly fee. This study compares the accuracy, comprehensibility, and response length of the free and paid versions of ChatGPT for patient education on hyperlipidemia.

Materials and methods: ChatGPT versions 3.5 and 4.0 were each given 25 questions from the Cleveland Clinic's frequently asked questions (FAQs) on hyperlipidemia under three different prompting conditions: no prompting (Form 1), patient-friendly prompting (Form 2), and physician-level prompting (Form 3). Responses were categorized as incorrect, partially correct, or correct, and the grade reading level and word count of each response were recorded for analysis.

Results: Overall scoring frequencies for ChatGPT version 3.5 were five (6.67%) incorrect, 18 (24.00%) partially correct, and 52 (69.33%) correct; for ChatGPT version 4.0, they were one (1.33%) incorrect, 18 (24.00%) partially correct, and 56 (74.67%) correct. The proportion of correct answers did not significantly differ between the two versions (p = 0.586). ChatGPT version 3.5 had a significantly higher grade reading level than version 4.0 (p = 0.0002) and a significantly higher word count (p = 0.0073).

Discussion: There was no significant difference in accuracy between the free and paid versions on hyperlipidemia FAQs. Both versions provided accurate but sometimes only partially complete responses. Version 4.0 offered more concise and readable information, aligning with the readability of most online medical resources while still exceeding the National Institutes of Health's (NIH's) recommended eighth-grade reading level. The paid version also demonstrated superior adaptability in tailoring responses to the prompt.

Conclusion: Both versions of ChatGPT provide reliable medical information, with the paid version offering more adaptable and readable responses. Healthcare providers can recommend ChatGPT as a source of patient education, regardless of the version used. Future research should explore diverse question formulations and ChatGPT's handling of incorrect information.
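The record does not include the authors' analysis code, and the abstract names neither its readability formula nor its statistical tests. As a minimal, hypothetical Python sketch of the workflow the abstract describes (three prompt forms, correctness grading, readability, word count, and between-version comparisons), the following assumes Flesch-Kincaid grade level (via textstat) and chi-square / Mann-Whitney U tests (via scipy); the prompt wordings and example responses are placeholders, not the study's materials.

    # Sketch of the comparison described in the abstract. All prompts and
    # responses below are placeholders, and the readability metric and
    # statistical tests are assumptions (the record does not specify them).
    from scipy.stats import chi2_contingency, mannwhitneyu
    import textstat

    # The three prompt forms from the methods (wording is hypothetical):
    PROMPTS = {
        "form_1": "{question}",                            # no prompting
        "form_2": "Explain for a patient: {question}",     # patient-friendly
        "form_3": "Explain for a physician: {question}",   # physician-level
    }

    # Placeholder responses; the study collected 75 per model
    # (25 Cleveland Clinic FAQs x 3 prompt forms).
    responses_35 = [
        "Hyperlipidemia is a condition characterized by elevated lipid levels ...",
        "Statins lower low-density lipoprotein cholesterol by inhibiting HMG-CoA reductase ...",
    ]
    responses_40 = [
        "Hyperlipidemia means there is too much fat, such as cholesterol, in your blood ...",
        "Statins are medicines that lower the 'bad' cholesterol in your blood ...",
    ]

    # Grade reading level and word count per response, as in the abstract.
    grades_35 = [textstat.flesch_kincaid_grade(r) for r in responses_35]
    grades_40 = [textstat.flesch_kincaid_grade(r) for r in responses_40]
    words_35 = [len(r.split()) for r in responses_35]
    words_40 = [len(r.split()) for r in responses_40]

    # Correctness frequencies reported in the abstract:
    # [incorrect, partially correct, correct] per version.
    counts = [
        [5, 18, 52],  # ChatGPT 3.5
        [1, 18, 56],  # ChatGPT 4.0
    ]
    chi2, p_accuracy, dof, _ = chi2_contingency(counts)
    print(f"accuracy: chi2 = {chi2:.2f}, p = {p_accuracy:.3f}")

    # Nonparametric comparison of readability and length between versions.
    print("grade level:", mannwhitneyu(grades_35, grades_40))
    print("word count:", mannwhitneyu(words_35, words_40))

Run on placeholder data, this will not reproduce the reported p-values (0.586, 0.0002, 0.0073), and the authors' actual tests may differ; the sketch only illustrates the shape of the analysis.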
Pages: 7
Related papers
50 records in total
  • [21] Evaluation of the prediagnosis and management of ChatGPT-4.0 in clinical cases in cardiology
    Yavuz, Yunus Emre
    Kahraman, Fatih
    FUTURE CARDIOLOGY, 2024, 20 (04) : 197 - 207
  • [22] ChatGPT-3.5 and -4.0 and mechanical engineering: Examining performance on the FE mechanical engineering and undergraduate exams
    Frenkel, Matthew E.
    Emara, Hebah
    COMPUTER APPLICATIONS IN ENGINEERING EDUCATION, 2024, 32 (06)
  • [23] Medication counseling for OTC drugs using customized ChatGPT-4: Comparison with ChatGPT-3.5 and ChatGPT-4o
    Kiyomiya, Keisuke
    Aomori, Tohru
    Ohtani, Hisakazu
    DIGITAL HEALTH, 2025, 11
  • [24] An empirical study of ChatGPT-3.5 on question answering and code maintenance
    Kabir, Md Mahir Asef
    Hassan, Sk Adnan
    Wang, Xiaoyin
    Wang, Ying
    Yu, Hai
    Meng, Na
    arXiv, 2023
  • [25] Readability and Appropriateness of Responses Generated by ChatGPT 3.5, ChatGPT 4.0, Gemini, and Microsoft Copilot for FAQs in Refractive Surgery
    Aydin, Fahri Onur
    Aksoy, Burakhan Kursat
    Ceylan, Ali
    Akbas, Yusuf Berk
    Ermis, Serhat
    Yildiz, Burcin Kepez
    Yildirim, Yusuf
    TURK OFTALMOLOJI DERGISI-TURKISH JOURNAL OF OPHTHALMOLOGY, 2024, 54 (06): 313 - 317
  • [26] A Feasibility Study on Automated SQL Exercise Generation with ChatGPT-3.5
    Aerts, Willem
    Fletcher, George
    Miedema, Daphne
    PROCEEDINGS OF THE 3RD ACM SIGMOD INTERNATIONAL WORKSHOP ON DATA SYSTEMS EDUCATION: BRIDGING EDUCATION PRACTICE WITH EDUCATION RESEARCH, DATAED 2024, 2024, : 13 - 19
  • [27] Comparative performance analysis of ChatGPT 3.5, ChatGPT 4.0 and Bard in answering common patient questions on melanoma
    Deliyannis, Eduardo Panaiotis
    Paul, Navreet
    Patel, Priya U.
    Papanikolaou, Marieta
    CLINICAL AND EXPERIMENTAL DERMATOLOGY, 2024, 49 (07) : 743 - 746
  • [28] A Study of ChatGPT-3.5's Performance in Solving Physics Problems
    童大振
    任红梅
    中学物理, 2023, 41 (09) : 11 - 14
  • [29] Teaching AI in a Scholar Context: Organizing a Workshop on ChatGPT-3.5
    Oyarzun, Javier
    COMPUTERS IN LIBRARIES, 2024, 44 (06) : 36 - 40
  • [30] Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing
    Sumbal, Anusha
    Sumbal, Ramish
    Amir, Alina
    JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT, 2024, 11