New Artificial Intelligence ChatGPT Performs Poorly on the 2022 Self-assessment Study Program for Urology
Cited by: 41
Authors:
Huynh, Linda My [1]
Bonebrake, Benjamin T. [2]
Schultis, Kaitlyn [2]
Quach, Alan [3]
Deibert, Christopher M. [3,4]
Affiliations:
[1] Univ Nebraska Med Ctr, Omaha, NE USA
[2] Univ Nebraska Med Ctr, Coll Med, Omaha, NE USA
[3] Univ Nebraska Med Ctr, Div Urol, Omaha, NE USA
[4] Univ Nebraska Med Ctr, Dept Surg, Div Urol, 987521 Nebraska Med Ctr, Omaha, NE 68198 USA
Keywords:
artificial intelligence;
medical informatics applications;
urology;
DOI:
10.1097/UPJ.0000000000000406
Chinese Library Classification:
R5 [Internal Medicine];
R69 [Urology (Diseases of the Urogenital System)];
Discipline classification codes:
1002;
100201;
Abstract:
Introduction: Large language models have demonstrated impressive capabilities, but their applicability to medicine remains unclear. We sought to evaluate the use of ChatGPT on the American Urological Association Self-assessment Study Program as an educational adjunct for urology trainees and practicing physicians.
Methods: One hundred fifty questions from the 2022 Self-assessment Study Program exam were screened, and those containing visual assets (n=15) were removed. The remaining items were encoded as open ended or multiple choice. ChatGPT's output was coded as correct, incorrect, or indeterminate; if indeterminate, responses were regenerated up to 2 times. Concordance, quality, and accuracy were ascertained by 3 independent researchers and reviewed by 2 physician adjudicators. A new session was started for each entry to avoid crossover learning.
Results: ChatGPT was correct on 36/135 (26.7%) open-ended and 38/135 (28.2%) multiple-choice questions. Indeterminate responses were generated in 40 (29.6%) and 4 (3.0%), respectively. Of the correct responses, 24/36 (66.7%) and 36/38 (94.7%) were on initial output, 8 (22.2%) and 1 (2.6%) on second output, and 4 (11.1%) and 1 (2.6%) on final output, respectively. Although regeneration decreased indeterminate responses, the proportion of correct responses did not increase. For both open-ended and multiple-choice questions, ChatGPT provided consistent justifications for incorrect answers and remained concordant between correct and incorrect answers.
Conclusions: ChatGPT previously demonstrated promise on medical licensing exams; however, application to the 2022 Self-assessment Study Program was not demonstrated. Performance was higher on multiple-choice than on open-ended questions. More important were the persistent justifications for incorrect responses; left unchecked, the use of ChatGPT in medicine may facilitate medical misinformation.
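The breakdown of correct responses by attempt reported in the Results can be re-derived from the counts alone. The following is a minimal sketch, not the authors' analysis code: the dictionary layout and the helper name `attempt_shares` are illustrative assumptions, and only the counts come from the abstract.

```python
# Counts of correct responses by attempt, taken from the Results in the abstract.
# The dictionary layout and the helper name are illustrative assumptions.
correct_by_attempt = {
    "open-ended": {"initial": 24, "second": 8, "final": 4},
    "multiple-choice": {"initial": 36, "second": 1, "final": 1},
}


def attempt_shares(counts):
    """Return each attempt's share of all correct responses, in percent."""
    total = sum(counts.values())
    return {attempt: round(100 * n / total, 1) for attempt, n in counts.items()}


for question_format, counts in correct_by_attempt.items():
    # Print the total number of correct responses and the per-attempt shares.
    print(question_format, sum(counts.values()), attempt_shares(counts))
```

The per-attempt counts sum to the 36 and 38 correct responses stated in the abstract, and the shares reproduce the reported 66.7%, 22.2%, 11.1% and 94.7%, 2.6%, 2.6%.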