Assessment of Artificial Intelligence Chatbot Responses to Common Patient Questions on Bone Sarcoma

Cited by: 0

Authors:

Khabaz, Kameel [1]

Newman-Hung, Nicole J. [2]

Kallini, Jennifer R. [2]

Kendal, Joseph [3]

Christ, Alexander B. [2]

Bernthal, Nicholas M. [2]

Wessel, Lauren E. [2]

Affiliations:

[1] David Geffen Sch Med UCLA, Los Angeles, CA 90095 USA

[2] Univ Calif Los Angeles, Dept Orthopaed Surg, Los Angeles, CA USA

[3] Univ Calgary, Dept Surg, Calgary, AB, Canada

Source:

JOURNAL OF SURGICAL ONCOLOGY | 2024

Keywords:

artificial intelligence; bone sarcomas; chatbots; chondrosarcoma; Ewing sarcoma; osteosarcoma

DOI:

10.1002/jso.27966

Chinese Library Classification number:

R73 [Oncology]

Discipline classification code:

100214

Abstract:

Background and Objectives: The potential impact of artificial intelligence (AI) chatbots on care for patients with bone sarcoma is poorly understood. Elucidating potential risks and benefits would allow surgeons to define appropriate roles for these tools in clinical care.

Methods: Eleven questions on bone sarcoma diagnosis, treatment, and recovery were entered into three AI chatbots. Answers were assessed on a 5-point Likert scale for five clinical accuracy metrics: relevance to the question, balance and lack of bias, basis on established data, factual accuracy, and completeness in scope. Responses were quantitatively assessed for empathy and readability. The Patient Education Materials Assessment Tool (PEMAT) was used to assess understandability and actionability.

Results: Chatbots scored highly on relevance (4.24) and balance/lack of bias (4.09) but lower on basing responses on established data (3.77), completeness (3.68), and factual accuracy (3.66). Responses generally scored well on understandability (84.30%), while actionability scores were low for questions on treatment (64.58%) and recovery (60.64%). GPT-4 exhibited the highest empathy (4.12). Average readability scores ranged from 10.28 for diagnosis questions to 11.65 for recovery questions.

Conclusions: While AI chatbots are promising tools, current limitations in factual accuracy and completeness, as well as concerns about inaccessibility to populations with lower health literacy, may significantly limit their clinical utility.

Pages: 6
