Evaluation of responses to cardiac imaging questions by the artificial intelligence large language model ChatGPT

Cited by: 6
Authors
Monroe, Cynthia L. [1 ]
Abdelhafez, Yasser G. [2 ]
Atsina, Kwame [3 ]
Aman, Edris [3 ]
Nardo, Lorenzo [2 ]
Madani, Mohammad H. [2 ]
Affiliations
[1] Calif Northstate Univ, Coll Med, 9700 W Taron Dr, Elk Grove, CA 95757 USA
[2] Univ Calif Davis, Med Ctr, Dept Radiol, 4860 Y St,Suite 3100, Sacramento, CA 95817 USA
[3] Univ Calif Davis, Med Ctr, Div Cardiovasc Med, 4860 Y St,Suite 0200, Sacramento, CA 95817 USA
Keywords
Accuracy; Cardiac imaging; ChatGPT; Patient education; Expert consensus document; Computed tomography (SCCT); Coronary artery disease; American College; Radiology (ACR); Society
DOI
10.1016/j.clinimag.2024.110193
Chinese Library Classification
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject Classification Code
1002; 100207; 1009
Abstract
Purpose: To assess ChatGPT's ability as a resource for educating patients on various aspects of cardiac imaging, including diagnosis, imaging modalities, indications, interpretation of radiology reports, and management.
Methods: Thirty questions were posed to ChatGPT-3.5 and ChatGPT-4 three times each, in three separate chat sessions. Responses were scored as correct, incorrect, or clinically misleading by three observers: two board-certified cardiologists and one board-certified radiologist with cardiac imaging subspecialization. Consistency of responses across the three sessions was also evaluated. Final categorization was based on agreement between at least two of the three observers.
Results: By majority vote, ChatGPT-3.5 answered 17 of 28 questions correctly (61%) and ChatGPT-4 answered 21 of 28 correctly (75%); a majority vote on correctness was not reached for two questions. ChatGPT-3.5 answered 26 of 30 questions consistently (87%), and ChatGPT-4 answered 29 of 30 consistently (97%). Responses were both consistent and correct for 17 of 28 questions (61%) with ChatGPT-3.5 and for 20 of 28 questions (71%) with ChatGPT-4.
Conclusion: ChatGPT-4 outperformed ChatGPT-3.5 on cardiac imaging questions in both correctness and consistency of responses. While both models answered over half of the questions correctly, the inaccurate, clinically misleading, and inconsistent responses observed suggest the need for further refinement before either is applied to educating patients about cardiac imaging.
Pages: 8