ChatGPT Performance on the American Urological Association Self-assessment Study Program and the Potential Influence of Artificial Intelligence in Urologic Training

Cited by: 39
Authors
Deebel, Nicholas A. [1 ,2 ]
Terlecki, Ryan [1 ]
Affiliations
[1] Wake Forest Univ, Bowman Gray Sch Med, Dept Urol, Winston Salem, NC USA
[2] Wake Forest Univ, Bowman Gray Sch Med, Dept Urol, 1 Med Ctr Blvd, Winston Salem, NC 27157 USA
DOI
10.1016/j.urology.2023.05.010
Chinese Library Classification
R5 [Internal Medicine]; R69 [Urology (Urogenital Diseases)];
Subject Classification Codes
1002 ; 100201 ;
Abstract
OBJECTIVE To assess the performance of the chat generative pre-trained transformer (ChatGPT) on the American Urological Association Self-Assessment Study Program (AUA SASP) and stratify performance by question stem complexity. METHODS Questions from the 2021-2022 AUA SASP program were administered to ChatGPT version 3 (ChatGPT-3). Questions were administered to the model utilizing a standardized prompt. The answer choice selected by ChatGPT was then used to answer the question stem in the AUA SASP program. ChatGPT was then prompted to assign a question stem order (first, second, third) to each question. The percentage of correctly answered questions was determined for each order level. All responses provided by ChatGPT were qualitatively assessed for appropriate rationale. RESULTS A total of 268 questions were administered to ChatGPT. ChatGPT performed better on the 2021 than on the 2022 AUA SASP question set, answering 42.3% versus 30.0% of questions correctly (P <.05). One hundred percent of answer explanations provided appropriate, relevant rationale regardless of whether the answer was correct. Further stratification included assessment by question order level. ChatGPT performed progressively better on the 2021 question set with decreasing order levels, with first-order questions reaching 53.8% (n = 14). However, differences in proportions did not reach statistical significance (P >.05). CONCLUSION ChatGPT answered many high-level questions correctly and provided a reasonable rationale for each answer choice. While ChatGPT was unable to answer numerous first-order questions, future language processing model learning may lead to the optimization of its fund of knowledge. This may lead to the utilization of artificial intelligence like ChatGPT as an educational tool for urology trainees and professors. UROLOGY 177: 29-33, 2023. (c) 2023 Elsevier Inc. All rights reserved.
Pages: 29-33
Page count: 5
Related Articles
25 records
  • [1] ChatGPT Performance on the American Urological Association Self-assessment Study Program and the Potential Influence of Artificial Intelligence in Urologic Training EDITORIAL COMMENT
    Griebling, Tomas
    Kaplan, Damara
    UROLOGY, 2023, 177 : 33 - 33
  • [2] New Artificial Intelligence ChatGPT Performs Poorly on the 2022 Self-assessment Study Program for Urology
    Huynh, Linda My
    Bonebrake, Benjamin T.
    Schultis, Kaitlyn
    Quach, Alan
    Deibert, Christopher M.
    UROLOGY PRACTICE, 2023, 10 (04) : 408+
  • [4] Artificial Intelligence ChatGPT and GPT4 Performance on Male and Female Sexual Dysfunction, Sexually Transmitted Infection, and Male Factor Infertility in the 2019 to 2023 American Urological Association Self-Assessment Study Programs
    Seyam, R. M.
    Khan, B. S.
    Aljazaeri, S. A.
    Arabi, T. Z.
    Alkhateeb, S. S.
    Alotaibi, M. F.
    Altaweel, W. M.
    JOURNAL OF SEXUAL MEDICINE, 2024, 21
  • [5] Artificial Intelligence on the Exam Table: ChatGPT's Advancement in Urology Self-assessment
    Cadiente, Angelo
    Chen, Jamie
    Nguyen, Jennifer
    Sadeghi-Nejad, Hossein
    Billah, Mubashir
    UROLOGY PRACTICE, 2023, 10 (06) : 521 - 523
  • [6] The Performance of ChatGPT on the American Society for Surgery of the Hand Self-Assessment Examination
    Arango, Sebastian D.
    Flynn, Jason C.
    Zeitlin, Jacob
    Wilson, Matthew S.
    Strohl, Adam B.
    Weiss, Lawrence E.
    Weir, Tristan B.
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2024, 16 (04)
  • [7] Google Bard Artificial Intelligence vs the 2022 Self-Assessment Study Program for Urology
    Huynh, Linda My
    Bonebrake, Benjamin T.
    Schultis, Kaitlyn
    Quach, Alan
    Deibert, Christopher M.
    UROLOGY PRACTICE, 2023, 10 (06)
  • [8] My AI Ate My Homework: Measuring ChatGPT Performance on the American College of Cardiology Self-Assessment Program
    Hossain, Afif
    Shaikh, Anam
    Hossain, Sarah
    Shaikh, Amina
    Thomas, Renjit
    CIRCULATION, 2024, 150
  • [9] Performance evaluation of ChatGPT 4.0 on cardiovascular questions from the medical knowledge self-assessment program
    Malkani, K.
    Zhang, R.
    Zhao, A.
    Jain, R.
    Collins, G. P.
    Parker, M.
    Maizes, D.
    Zhang, R.
    Kini, V.
    EUROPEAN HEART JOURNAL, 2024, 45
  • [10] Self-assessment of university students on the application and potential of Artificial Intelligence for their formation
    Aguilar, Nivia T. Alvarez
    Cubero, Arnulfo Trevino
    Elizondo, Jaime Arturo Castillo
    ATENAS, 2024, (62)