Appropriateness of ChatGPT in Answering Heart Failure Related Questions

Cited by: 12
Authors
King, Ryan C. [1 ]
Samaan, Jamil S. [2 ]
Yeo, Yee Hui [2 ]
Mody, Behram [1 ]
Lombardo, Dawn M. [1 ]
Ghashghaei, Roxana [1 ]
Affiliations
[1] Univ Calif Irvine, Irvine Med Ctr, Dept Med, Div Cardiol, 101 City Dr South, Orange, CA 92868 USA
[2] Cedars Sinai Med Ctr, Dept Med, Karsh Div Gastroenterol & Hepatol, Los Angeles, CA USA
Source
HEART LUNG AND CIRCULATION | 2024, Vol. 33, Issue 09
Keywords
Heart failure; ChatGPT; Health education; Artificial intelligence; Equity;
DOI
10.1016/j.hlc.2024.03.005
Chinese Library Classification
R5 [Internal Medicine];
Subject Classification Codes
1002 ; 100201 ;
Abstract
Background: Heart failure requires complex management, and increased patient knowledge has been shown to improve outcomes. This study assessed the knowledge of Chat Generative Pre-trained Transformer (ChatGPT) and its appropriateness as a supplemental resource of information for patients with heart failure.
Methods: A total of 107 frequently asked heart failure-related questions were included across three categories: "basic knowledge" (49), "management" (41), and "other" (17). Two responses per question were generated by each of GPT-3.5 and GPT-4 (i.e., two responses per question per model). The accuracy and reproducibility of responses were graded by two reviewers board-certified in cardiology, with differences resolved by a third reviewer board-certified in cardiology and advanced heart failure. Accuracy was graded on a four-point scale: (1) comprehensive, (2) correct but inadequate, (3) some correct and some incorrect, and (4) completely incorrect.
Results: GPT-4 provided correct information in 107/107 (100%) of responses and displayed a greater proportion of comprehensive knowledge in the "basic knowledge" and "management" categories (89.8% and 82.9%, respectively). For GPT-3.5, two responses (1.9%) were graded as "some correct and some incorrect", while no "completely incorrect" responses were produced. With respect to comprehensive knowledge, GPT-3.5 performed best in the "management" and "other" (prognosis, procedures, and support) categories (78.1% and 94.1%, respectively). Both models also provided highly reproducible responses, with GPT-3.5 scoring above 94% in every category and GPT-4 scoring 100% for all answers.
Conclusions: GPT-3.5 and GPT-4 answered the majority of heart failure-related questions accurately and reliably. If validated in future studies, ChatGPT may serve as a useful tool for providing accessible health-related information and education to patients living with heart failure. In its current state, ChatGPT necessitates further rigorous testing and validation to ensure patient safety and equity across all patient demographics.
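Illustrative note: the Methods above describe generating two responses per question from each model and grading them afterwards for accuracy and reproducibility. As a minimal sketch only, and not the authors' actual pipeline, the Python snippet below shows how such paired responses could be collected with the OpenAI Python SDK; the model identifiers, sample questions, and the ask() helper are assumptions introduced here for illustration.

# Illustrative sketch only: collect two responses per question per model for
# later manual grading. Not the authors' pipeline; model names, sample
# questions, and the ask() helper are hypothetical stand-ins.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

MODELS = ["gpt-3.5-turbo", "gpt-4"]   # stand-ins for GPT-3.5 and GPT-4
RUNS_PER_QUESTION = 2                 # two responses per question per model, mirroring the study design

questions = [
    "What is heart failure?",                                         # "basic knowledge" example
    "How is heart failure with reduced ejection fraction managed?",   # "management" example
]

def ask(model: str, question: str) -> str:
    """Return a single free-text response from the given model."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

# Responses are collected here; accuracy (the four-point scale above) and
# reproducibility would then be graded manually by the cardiology reviewers.
responses = {
    (model, q, run): ask(model, q)
    for model in MODELS
    for q in questions
    for run in range(RUNS_PER_QUESTION)
}

for (model, q, run), answer in sorted(responses.items()):
    print(f"[{model} | run {run + 1}] {q}\n{answer}\n")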
Pages: 1314-1318
Number of pages: 5
Related Papers
50 items in total
  • [31] ANSWERING PHARMACY-RELATED LEGAL QUESTIONS
    SWARTZ, AJ
    AMERICAN JOURNAL OF HOSPITAL PHARMACY, 1986, 43 (10) : 2381
  • [32] Automatically Answering API-Related Questions
    Wu, Di
    Jing, Xiao-Yuan
    Chen, Haowen
    Zhu, Xiaoke
    Zhang, Hongyu
    Zuo, Mei
    Zi, Lu
    Zhu, Chen
    PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING - COMPANION (ICSE-COMPANION), 2018 : 270 - 271
  • [33] Comparative performance analysis of ChatGPT 3.5, ChatGPT 4.0 and Bard in answering common patient questions on melanoma
    Deliyannis, Eduardo Panaiotis
    Paul, Navreet
    Patel, Priya U.
    Papanikolaou, Marieta
    CLINICAL AND EXPERIMENTAL DERMATOLOGY, 2024, 49 (07) : 743 - 746
  • [34] ChatGPT's responses to questions related to epilepsy
    Daungsupawong, Hinpetch
    Wiwanitkit, Viroj
    SEIZURE-EUROPEAN JOURNAL OF EPILEPSY, 2024, 114 : 105 - 105
  • [35] Appropriateness of heart failure therapy in Bulgaria
    Katova, T.
    Simova, I.
    Bayraktarova, I.
    EUROPEAN JOURNAL OF HEART FAILURE, 2016, 18 : 498 - 498
  • [36] ANSWERING QUESTIONS WITH QUESTIONS
    Howard, Ravi
    CALLALOO, 2011, 34 (03) : 724 - 725
  • [37] Accuracy of ChatGPT3.5 in answering clinical questions on guidelines for severe acute pancreatitis
    Qiu, Jun
    Luo, Li
    Zhou, Youlian
    BMC GASTROENTEROLOGY, 2024, 24 (01)
  • [38] Evaluating ChatGPT's Performance in Answering Questions About Allergic Rhinitis and Chronic Rhinosinusitis
    Ye, Fan
    Zhang, He
    Luo, Xin
    Wu, Tong
    Yang, Qintai
    Shi, Zhaohui
    OTOLARYNGOLOGY-HEAD AND NECK SURGERY, 2024, 171 (02) : 571 - 577
  • [39] ChatGPT Versus Consultants: Blinded Evaluation on Answering Otorhinolaryngology Case-Based Questions
    Buhr, Christoph Raphael
    Smith, Harry
    Huppertz, Tilman
    Bahr-Hamm, Katharina
    Matthias, Christoph
    Blaikie, Andrew
    Kelsey, Tom
    Kuhn, Sebastian
    Eckrich, Jonas
    JMIR MEDICAL EDUCATION, 2023, 9
  • [40] Comparative analysis of ChatGPT and Bard in answering pathology examination questions requiring image interpretation
    Apornvirat, Sompon
    Namboonlue, Chutimon
    Laohawetwanit, Thiyaphat
    AMERICAN JOURNAL OF CLINICAL PATHOLOGY, 2024, 162 (03) : 252 - 260