Evaluation of responses to cardiac imaging questions by the artificial intelligence large language model ChatGPT

Cited by: 6
Authors
Monroe, Cynthia L. [1]
Abdelhafez, Yasser G. [2]
Atsina, Kwame [3]
Aman, Edris [3]
Nardo, Lorenzo [2]
Madani, Mohammad H. [2]
Affiliations
[1] Calif Northstate Univ, Coll Med, 9700 W Taron Dr, Elk Grove, CA 95757 USA
[2] Univ Calif Davis, Med Ctr, Dept Radiol, 4860 Y St,Suite 3100, Sacramento, CA 95817 USA
[3] Univ Calif Davis, Med Ctr, Div Cardiovasc Med, 4860 Y St,Suite 0200, Sacramento, CA 95817 USA
Keywords
Accuracy; Cardiac imaging; ChatGPT; Patient education; EXPERT CONSENSUS DOCUMENT; COMPUTED-TOMOGRAPHY SCCT; CORONARY-ARTERY-DISEASE; AMERICAN-COLLEGE; RADIOLOGY ACR; SOCIETY
DOI
10.1016/j.clinimag.2024.110193
Chinese Library Classification (CLC)
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject Classification
1002; 100207; 1009
Abstract
Purpose: To assess ChatGPT's ability as a resource for educating patients on various aspects of cardiac imaging, including diagnosis, imaging modalities, indications, interpretation of radiology reports, and management. Methods: Thirty questions were posed to ChatGPT-3.5 and ChatGPT-4, three times each in three separate chat sessions. Responses were scored as correct, incorrect, or clinically misleading by three observers: two board-certified cardiologists and one board-certified radiologist with cardiac imaging subspecialization. Consistency of responses across the three sessions was also evaluated. Final categorization was based on agreement of at least two of the three observers. Results: ChatGPT-3.5 answered 17 of 28 questions correctly (61%) by majority vote; ChatGPT-4 answered 21 of 28 correctly (75%). A majority vote on correctness was not reached for two questions. ChatGPT-3.5 answered 26 of 30 questions consistently (87%); ChatGPT-4 answered 29 of 30 consistently (97%). Responses were both consistent and correct for 17 of 28 questions (61%) with ChatGPT-3.5, and for 20 of 28 questions (71%) with ChatGPT-4. Conclusion: ChatGPT-4 outperformed ChatGPT-3.5 on cardiac imaging questions in both correctness and consistency of responses. While both models answered over half of the questions correctly, inaccurate, clinically misleading, and inconsistent responses indicate that further refinement is needed before they are used to educate patients about cardiac imaging.
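The scoring scheme described in the Methods is a simple two-of-three majority vote, with consistency tracked separately across the three chat sessions; questions lacking any majority label drop out of the denominator, which is why correctness percentages are reported over 28 rather than 30 questions. A minimal Python sketch of that bookkeeping follows, using hypothetical question IDs and illustrative names (majority_label, observer_labels, consistent) that are not from the paper.

```python
from collections import Counter

# Hedged illustration (not the authors' actual code): the final label per
# question is whichever of "correct" / "incorrect" / "clinically misleading"
# at least two of the three observers assigned; questions with no majority
# are excluded from the correctness denominator.

def majority_label(labels):
    """Return the label >= 2 of 3 observers agreed on, or None if no majority."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= 2 else None

# Toy data (hypothetical): three observer labels per question, plus a flag for
# whether the model's answer was consistent across the three chat sessions.
observer_labels = {
    "Q1": ["correct", "correct", "incorrect"],
    "Q2": ["correct", "incorrect", "clinically misleading"],  # no majority
    "Q3": ["correct", "correct", "correct"],
}
consistent = {"Q1": True, "Q2": True, "Q3": False}

finals = {q: majority_label(v) for q, v in observer_labels.items()}
scored = [q for q, lab in finals.items() if lab is not None]  # Q2 drops out

n_correct = sum(1 for q in scored if finals[q] == "correct")
n_correct_and_consistent = sum(
    1 for q in scored if finals[q] == "correct" and consistent[q]
)
print(f"correct: {n_correct}/{len(scored)}")
print(f"correct and consistent: {n_correct_and_consistent}/{len(scored)}")
```

On the toy data this prints "correct: 2/2" and "correct and consistent: 1/2": Q2 is excluded for lack of a majority, and Q3 counts as correct but not consistent, mirroring how the paper can report a question as correct yet inconsistent.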
Pages: 8
Related papers (50 in total)
  • [21] The complex ethics of applying ChatGPT and language model artificial intelligence in dermatology
    Ferreira, Alana Luna
    Lipoff, Jules B.
    JOURNAL OF THE AMERICAN ACADEMY OF DERMATOLOGY, 2023, 89 (04) : E157 - E158
  • [22] The Accuracy of Artificial Intelligence ChatGPT in Oncology Examination Questions
    Chow, Ronald
    Hasan, Shaakir
    Zheng, Ajay
    Gao, Chenxi
    Valdes, Gilmer
    Yu, Francis
    Chhabra, Arpit
    Raman, Srinivas
    Choi, J. Isabelle
    Lin, Haibo
    Simone, Charles B.
    JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY, 2024, 21 (11) : 1800 - 1804
  • [23] A Generative Artificial Intelligence Using Multilingual Large Language Models for ChatGPT Applications
    Tuan, Nguyen Trung
    Moore, Philip
    Thanh, Dat Ha Vu
    Pham, Hai Van
    APPLIED SCIENCES-BASEL, 2024, 14 (07):
  • [24] Artificial intelligence-based ChatGPT chatbot responses for patient and parent questions on vernal keratoconjunctivitis
    Rasmussen, Marie Louise Roed
    Larsen, Ann-Cathrine
    Subhi, Yousif
    Potapenko, Ivan
    GRAEFES ARCHIVE FOR CLINICAL AND EXPERIMENTAL OPHTHALMOLOGY, 2023, 261 (10) : 3041 - 3043
  • [27] Response to "Large language model artificial intelligence: the current state and future of ChatGPT in neuro-oncology publishing"
    Gupta, Nithin K. K.
    Doyle, David M. M.
    D'Amico, Randy S. S.
    JOURNAL OF NEURO-ONCOLOGY, 2023, 163 (03) : 731 - 733
  • [28] Assessing the Accuracy of Responses by the Language Model ChatGPT to Questions Regarding Bariatric Surgery
    Samaan, Jamil S.
    Yeo, Yee Hui
    Rajeev, Nithya
    Hawley, Lauren
    Abel, Stuart
    Ng, Wee Han
    Srinivasan, Nitin
    Park, Justin
    Burch, Miguel
    Watson, Rabindra
    Liran, Omer
    Samakar, Kamran
    OBESITY SURGERY, 2023, 33 (06) : 1790 - 1796
  • [29] Use of ChatGPT4, an Online Artificial Intelligence Large Language Model, for Predicting Colonoscopy Screening Intervals
    Chang, Patrick
    Amini, Maziar
    Nguyen, Denis
    Lee, Helen
    Phan, Jennifer
    Buxbaum, James
    Sahakian, Ara B.
AMERICAN JOURNAL OF GASTROENTEROLOGY, 2023, 118 (10) : S225 - S225