Large Language Model-Based Chatbot vs Surgeon-Generated Informed Consent Documentation for Common Procedures

Cited by: 52
Authors
Decker, Hannah [1, 2]
Trang, Karen [2]
Ramirez, Joel [2]
Colley, Alexis [2]
Pierce, Logan [3]
Coleman, Melissa [2]
Bongiovanni, Tasce [2]
Melton, Genevieve B. [4, 5]
Wick, Elizabeth [2]
Affiliations
[1] Univ Calif San Francisco, Philip R Lee Inst Hlth Policy Studies, Mission Bay Campus, Valley Tower, 490 Illinois St, 7t, San Francisco, CA 94158 USA
[2] Univ Calif San Francisco, Dept Surg, San Francisco, CA 94158 USA
[3] Univ Calif San Francisco, Dept Med, San Francisco, CA 94158 USA
[4] Univ Minnesota, Inst Hlth Informat, Dept Surg, Minneapolis, MN USA
[5] Univ Minnesota, Ctr Learning Hlth Syst Sci, Minneapolis, MN USA
Keywords
READABILITY
DOI
10.1001/jamanetworkopen.2023.36997
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Importance: Informed consent is a critical component of patient care before invasive procedures, yet it is frequently inadequate. Electronic consent forms have the potential to facilitate patient comprehension if they provide information that is readable, accurate, and complete. It is not known whether large language model (LLM)-based chatbots may improve informed consent documentation by generating accurate and complete information that is easily understood by patients.

Objective: To compare the readability, accuracy, and completeness of LLM-based chatbot- vs surgeon-generated information on the risks, benefits, and alternatives (RBAs) of common surgical procedures.

Design, Setting, and Participants: This cross-sectional study compared randomly selected surgeon-generated RBAs used in signed electronic consent forms at an academic referral center in San Francisco with LLM-based chatbot-generated (ChatGPT-3.5, OpenAI) RBAs for 6 surgical procedures (colectomy, coronary artery bypass graft, laparoscopic cholecystectomy, inguinal hernia repair, knee arthroplasty, and spinal fusion).

Main Outcomes and Measures: Readability was measured using previously validated scales (Flesch-Kincaid grade level, Gunning Fog index, the Simple Measure of Gobbledygook, and the Coleman-Liau index). Scores range from 0 to greater than 20 and indicate the years of education required to understand a text. Accuracy and completeness were assessed using a rubric developed with recommendations from LeapFrog, the Joint Commission, and the American College of Surgeons. Both composite and RBA subgroup scores were compared.

Results: The total sample consisted of 36 RBAs, with 1 RBA generated by the LLM-based chatbot and 5 RBAs generated by a surgeon for each of the 6 surgical procedures. The mean (SD) readability score for the LLM-based chatbot RBAs was 12.9 (2.0) vs 15.7 (4.0) for surgeon-generated RBAs (P = .10). The mean (SD) composite completeness and accuracy score was lower for surgeons' RBAs at 1.6 (0.5) than for LLM-based chatbot RBAs at 2.2 (0.4) (P < .001). The LLM-based chatbot scores were higher than the surgeon-generated scores for descriptions of the benefits of surgery (2.3 [0.7] vs 1.4 [0.7]; P < .001) and alternatives to surgery (2.7 [0.5] vs 1.4 [0.7]; P < .001). There was no significant difference in chatbot vs surgeon RBA scores for risks of surgery (1.7 [0.5] vs 1.7 [0.4]; P = .38).

Conclusions and Relevance: The findings of this cross-sectional study suggest that, although imperfect, LLM-based chatbots have the potential to enhance informed consent documentation. If an LLM were embedded in electronic health records in a manner compliant with the Health Insurance Portability and Accountability Act, it could be used to provide personalized risk information while easing documentation burden for physicians.
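The four readability scales named under Main Outcomes and Measures each map sentence length and word complexity to an approximate school grade. The Python sketch below applies their commonly published formulas and is illustrative only: the abstract does not say which software the authors used, so the function names, the vowel-run syllable counter, and the simplified rule that any word of 3 or more syllables counts as "complex" are all assumptions made here, not the study's method.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): count runs of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability(text: str) -> dict:
    """Return four grade-level estimates for an English text."""
    # Naive tokenization: sentences end at . ! or ?; words are letter runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    n_sent, n_words = len(sentences), len(words)
    if n_sent == 0 or n_words == 0:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = [count_syllables(w) for w in words]
    n_syll = sum(syllables)
    # Simplification: treat every word with 3+ syllables as complex.
    n_poly = sum(1 for s in syllables if s >= 3)
    n_letters = sum(len(w) for w in words)

    words_per_sentence = n_words / n_sent
    return {
        # Flesch-Kincaid grade level
        "flesch_kincaid": 0.39 * words_per_sentence
        + 11.8 * (n_syll / n_words)
        - 15.59,
        # Gunning Fog index
        "gunning_fog": 0.4 * (words_per_sentence + 100 * n_poly / n_words),
        # Simple Measure of Gobbledygook (SMOG)
        "smog": 1.0430 * math.sqrt(n_poly * 30 / n_sent) + 3.1291,
        # Coleman-Liau index: letters and sentences per 100 words
        "coleman_liau": 0.0588 * (100 * n_letters / n_words)
        - 0.296 * (100 * n_sent / n_words)
        - 15.8,
    }
```

Calling `readability(consent_text)` on the text of an RBA returns one estimate per scale. Very short or very simple passages can score below 0 on some formulas, and a dictionary-based syllable counter would give more reliable values than the heuristic used here.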
Pages: 10