Comparative analysis of ChatGPT and Bard in answering pathology examination questions requiring image interpretation

Cited by: 8
Authors
Apornvirat, Sompon [1 ,2 ]
Namboonlue, Chutimon [3 ]
Laohawetwanit, Thiyaphat [1 ,2 ]
Affiliations
[1] Thammasat Univ, Chulabhorn Int Coll Med, Div Pathol, Pathum Thani, Thailand
[2] Thammasat Univ Hosp, Div Pathol, Pathum Thani, Thailand
[3] Dr Pong Clin, Bangkok, Thailand
Keywords
artificial intelligence; pathology; diagnosis
DOI
10.1093/ajcp/aqae036
CLC classification
R36 [Pathology]
Discipline code
100104
Abstract
Objectives: To evaluate the accuracy of ChatGPT and Bard in answering pathology examination questions requiring image interpretation.
Methods: The study evaluated ChatGPT-4 and Bard's performance using 86 multiple-choice questions, with 17 (19.8%) focusing on general pathology and 69 (80.2%) on systemic pathology. Of these, 62 (72.1%) included microscopic images, and 57 (66.3%) were first-order questions focusing on diagnosing the disease. The authors presented these artificial intelligence (AI) tools with questions, both with and without clinical context, and assessed their answers against a reference standard set by pathologists.
Results: ChatGPT-4 achieved a 100% (n = 86) accuracy rate on questions with clinical context, surpassing Bard's 87.2% (n = 75). Without context, the accuracy of both AI tools declined significantly, with ChatGPT-4 at 52.3% (n = 45) and Bard at 38.4% (n = 33). ChatGPT-4 consistently outperformed Bard across various categories, particularly in systemic pathology and first-order questions. A notable issue identified was Bard's tendency to "hallucinate," that is, to provide plausible but incorrect answers, especially without clinical context.
Conclusions: This study demonstrated the potential of ChatGPT and Bard in pathology education, stressing the importance of clinical context for accurate AI interpretation of pathology images. It underlined the need for careful AI integration in medical education.
Pages: 252-260 (9 pages)
Related articles (50 total)
  • [1] Comparative performance analysis of ChatGPT 3.5, ChatGPT 4.0 and Bard in answering common patient questions on melanoma
    Deliyannis, Eduardo Panaiotis
    Paul, Navreet
    Patel, Priya U.
    Papanikolaou, Marieta
    CLINICAL AND EXPERIMENTAL DERMATOLOGY, 2024, 49 (07) : 743 - 746
  • [2] AI IN HEPATOLOGY: A COMPARATIVE ANALYSIS OF CHATGPT-4, BING, AND BARD AT ANSWERING CLINICAL QUESTIONS
    Anvari, Sama
    Lee, Yung
    Jin, David S.
    Malone, Sarah
    Collins, Matthew
    GASTROENTEROLOGY, 2024, 166 (05) : S888 - S888
  • [3] Artificial intelligence in hepatology: a comparative analysis of ChatGPT-4, Bing, and Bard at answering clinical questions
    Anvari, Sama
    Lee, Yung
    Jin, David Shiqiang
    Malone, Sarah
    Collins, Matthew
    JOURNAL OF THE CANADIAN ASSOCIATION OF GASTROENTEROLOGY, 2025,
  • [4] Performance of ChatGPT versus Google Bard on Answering Postgraduate-Level Surgical Examination Questions: A Meta-Analysis
    Andrew, Albert
    Zhao, Sunny
    INDIAN JOURNAL OF SURGERY, 2025,
  • [5] A Comparative Analysis of ChatGPT-4, Microsoft's Bing and Google's Bard at Answering Rheumatology Clinical Questions
    Yingchoncharoen, Pitchaporn
    Chaisrimaneepan, Nattanicha
    Pangkanon, Watsachon
    Thongpiya, Jerapas
    ARTHRITIS & RHEUMATOLOGY, 2024, 76 : 2654 - 2655
  • [6] Large language models in pathology: A comparative study of ChatGPT and Bard with pathology trainees on multiple-choice questions
    Du, Wei
    Jin, Xueting
    Harris, Jaryse Carol
    Brunetti, Alessandro
    Johnson, Erika
    Leung, Olivia
    Li, Xingchen
    Walle, Selemon
    Yu, Qing
    Zhou, Xiao
    Bian, Fang
    Mckenzie, Kajanna
    Kanathanavanich, Manita
    Ozcelik, Yusuf
    El-Sharkawy, Farah
    Koga, Shunsuke
    ANNALS OF DIAGNOSTIC PATHOLOGY, 2024, 73
  • [7] Nephrology Tools: Using Chatbots for Image Interpretation and Answering Questions
    Garcia Valencia, Oscar Alejandro
    Thongprayoon, Charat
    Krisanapan, Pajaree
    Suppadungsuk, Supawadee
    Cheungpasitporn, Wisit
    Miao, Jing
    JOURNAL OF THE AMERICAN SOCIETY OF NEPHROLOGY, 2024, 35 (10):
  • [8] Evaluation of ChatGPT's Proficiency in Pathology: An Analysis of Textual and Image-Based Questions
    Khan, Anam
    Khan, Atif
    Faraz, Muhammad
    Parwani, Anil
    Singh, Rajendra
    Amin, Bijal
    LABORATORY INVESTIGATION, 2024, 104 (03) : S1583 - S1585
  • [9] Artificial intelligence performance in answering multiple-choice oral pathology questions: a comparative analysis
    Yilmaz, Birkan Eyup
    Gokkurt Yilmaz, Busra Nur
    Ozbey, Furkan
    BMC Oral Health, 25 (1)
  • [10] A comparative analysis of the ethics of gene editing: ChatGPT vs. Bard
    Burright, Jack
    Al-khateeb, Samer
    COMPUTATIONAL AND MATHEMATICAL ORGANIZATION THEORY, 2024,