New Artificial Intelligence ChatGPT Performs Poorly on the 2022 Self-assessment Study Program for Urology
Cited by: 41
Authors:
Huynh, Linda My [1]
Bonebrake, Benjamin T. [2]
Schultis, Kaitlyn [2]
Quach, Alan [3]
Deibert, Christopher M. [3,4]
Affiliations:
[1] Univ Nebraska Med Ctr, Omaha, NE USA
[2] Univ Nebraska Med Ctr, Coll Med, Omaha, NE USA
[3] Univ Nebraska Med Ctr, Div Urol, Omaha, NE USA
[4] Univ Nebraska Med Ctr, Dept Surg, Div Urol, 987521 Nebraska Med Ctr, Omaha, NE 68198 USA
Keywords: artificial intelligence; medical informatics applications; urology
DOI: 10.1097/UPJ.0000000000000406
Chinese Library Classification: R5 [Internal Medicine]; R69 [Urology (Diseases of the Urogenital System)]
Discipline classification codes: 1002; 100201
Abstract:
Introduction: Large language models have demonstrated impressive capabilities, but their application to medicine remains unclear. We seek to evaluate the use of ChatGPT on the American Urological Association Self-assessment Study Program as an educational adjunct for urology trainees and practicing physicians.
Methods: One hundred fifty questions from the 2022 Self-assessment Study Program exam were screened, and those containing visual assets (n=15) were removed. The remaining items were encoded as open ended or multiple choice. ChatGPT's output was coded as correct, incorrect, or indeterminate; if indeterminate, responses were regenerated up to 2 times. Concordance, quality, and accuracy were ascertained by 3 independent researchers and reviewed by 2 physician adjudicators. A new session was started for each entry to avoid crossover learning.
Results: ChatGPT was correct on 36/135 (26.7%) open-ended and 38/135 (28.2%) multiple-choice questions. Indeterminate responses were generated in 40 (29.6%) and 4 (3.0%), respectively. Of the correct responses, 24/36 (66.7%) and 36/38 (94.7%) were on initial output, 8 (22.2%) and 1 (2.6%) on second output, and 4 (11.1%) and 1 (2.6%) on final output, respectively. Although regeneration decreased indeterminate responses, the proportion of correct responses did not increase. For open-ended and multiple-choice questions, ChatGPT provided consistent justifications for incorrect answers and remained concordant between correct and incorrect answers.
Conclusions: ChatGPT previously demonstrated promise on medical licensing exams; however, application to the 2022 Self-assessment Study Program was not demonstrated. Performance improved with multiple-choice over open-ended questions. More important were the persistent justifications for incorrect responses. Left unchecked, utilization of ChatGPT in medicine may facilitate medical misinformation.