Assessing Ability for ChatGPT to Answer Total Knee Arthroplasty-Related Questions

Cited by: 7

Authors:

Magruder, Matthew L. [1]

Rodriguez, Ariel N. [1]

Wong, Jason C. J. [1]

Erez, Orry [1]

Piuzzi, Nicolas S. [2]

Scuderi, Gil R. [3]

Slover, James D. [3]

Oh, Jason H. [3]

Schwarzkopf, Ran [4]

Chen, Antonia F. [5]

Iorio, Richard [5]

Goodman, Stuart B. [6]

Mont, Michael A. [7]

Affiliations:

[1] Maimonides Hosp, Dept Orthopaed Surg, 927 49th St, Brooklyn, NY 11219 USA

[2] Cleveland Clin, Dept Orthopaed Surg, Cleveland, OH USA

[3] Lenox Hill Hosp, Northwell Orthopaed Inst, Dept Orthopaed Surg, New York, NY USA

[4] NYU Langone Hlth, Dept Orthopaed Surg, NYU Langone Orthoped, New York, NY USA

[5] Brigham & Womens Hosp, Dept Orthopaed Surg, Boston, MA USA

[6] Stanford Univ, Sch Med, Dept Orthopaed Surg, Redwood City, CA USA

[7] Sinai Hosp Baltimore, Rubin Inst Adv Orthoped, Baltimore, MD USA

Source:

JOURNAL OF ARTHROPLASTY | 2024 / Vol. 39 / Issue 08

Keywords:

ChatGPT; artificial intelligence; large language model; total knee arthroplasty; clinical practice guidelines; ARTIFICIAL-INTELLIGENCE; PERFORMANCE; CALL;

DOI:

10.1016/j.arth.2024.02.023

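Chinese Library Classification number:

R826.8 [Plastic Surgery]; R782.2 [Oral and Maxillofacial Plastic Surgery]; R726.2 [Pediatric Plastic Surgery]; R62 [Plastic Surgery (Reconstructive Surgery)];

Discipline classification code: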

Abstract:

Background: Artificial intelligence in orthopaedics has been a topic of increasing interest and opportunity in recent years. Its applications are widespread for both physicians and patients, including use in clinical decision-making, in the operating room, and in research. In this study, we aimed to assess the quality of ChatGPT answers to questions related to total knee arthroplasty.

Methods: ChatGPT prompts were created by turning 15 of the American Academy of Orthopaedic Surgeons Clinical Practice Guidelines into questions. An online survey was created, which included screenshots of each prompt and the answers to the 15 questions. Surgeons were asked to grade the ChatGPT answers from 1 to 5 on six characteristics: (1) relevance, (2) accuracy, (3) clarity, (4) completeness, (5) evidence-based, and (6) consistency. Eleven Adult Joint Reconstruction fellowship-trained surgeons completed the survey. Questions were subclassified by the subject of the prompt: (1) risk factors, (2) implant/intraoperative, and (3) pain/functional outcomes. The average and standard deviation were calculated for all answers and for each subgroup. Inter-rater reliability (IRR) was also calculated.

Results: All answer characteristics were graded as above average (i.e., a score > 3). Relevance received the highest scores from the surgeons surveyed (4.43 +/- 0.77), and consistency received the lowest (3.54 +/- 1.10). ChatGPT prompts in the Risk Factors group received the best-rated responses, while those in the Pain/Functional Outcomes group received the lowest-rated. The overall IRR was 0.33 (poor reliability), with the highest IRR for relevance (0.43) and the lowest for evidence-based (0.28).

Conclusions: ChatGPT can answer questions regarding well-established clinical guidelines in total knee arthroplasty with above-average accuracy but demonstrates variable reliability. This investigation is the first step in understanding large language models such as ChatGPT and how well they perform in the field of arthroplasty. (c) 2024 Elsevier Inc. All rights reserved.
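
The summary statistics named in the Methods (mean, standard deviation, and inter-rater reliability) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the ratings are made-up placeholder values, the function names are hypothetical, and the abstract does not say which IRR statistic was used, so the two-way random-effects ICC(2,1) below is only one plausible choice.

```python
from statistics import mean, stdev


def mean_sd(scores):
    """Return (mean, sample standard deviation) of a flat list of scores."""
    return mean(scores), stdev(scores)


def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per rated answer, one score per rater.
    ASSUMPTION: the source does not name its IRR statistic; ICC(2,1) is
    used here purely for illustration.
    """
    n = len(ratings)        # number of rated answers
    k = len(ratings[0])     # number of raters
    grand = mean([x for row in ratings for x in row])
    row_means = [mean(row) for row in ratings]
    col_means = [mean([row[j] for row in ratings]) for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-answer mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical 1-5 grades: 4 answers (rows) scored by 3 raters (columns).
ratings = [[5, 4, 4], [4, 4, 3], [3, 4, 2], [5, 5, 4]]
m, sd = mean_sd([x for row in ratings for x in row])
print(f"{m:.2f} +/- {sd:.2f}, ICC(2,1) = {icc2_1(ratings):.2f}")
```

In a study like this one, the same two functions would be applied once per characteristic (relevance, accuracy, and so on) and once per question subgroup.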

Pages: 6

