Assessment of Artificial Intelligence Chatbot Responses to Top Searched Queries About Cancer

Cited by: 70
Authors
Pan, Alexander [1 ]
Musheyev, David [1 ]
Bockelman, Daniel [1 ]
Loeb, Stacy [2 ,3 ,4 ]
Kabarriti, Abdo E. [1 ,5 ]
Affiliations
[1] SUNY Downstate Hlth Sci Univ, Dept Urol, New York, NY USA
[2] New York Univ, Sch Med, Dept Urol, New York, NY USA
[3] New York Univ, Dept Populat Hlth, Sch Med, New York, NY USA
[4] VA New York Harbor Hlth Care, Dept Med, New York, NY USA
[5] SUNY Downstate Hlth Sci Univ, Dept Urol, 451 Clarkson Ave, Brooklyn, NY 11203 USA
DOI: 10.1001/jamaoncol.2023.2947
Chinese Library Classification: R73 [Oncology]
Discipline Code: 100214
Abstract
Importance: Consumers are increasingly using artificial intelligence (AI) chatbots as a source of information. However, the quality of the cancer information generated by these chatbots has not yet been evaluated using validated instruments.
Objective: To characterize the quality of information and presence of misinformation about skin, lung, breast, colorectal, and prostate cancers generated by 4 AI chatbots.
Design, Setting, and Participants: This cross-sectional study assessed AI chatbots' text responses to the 5 most commonly searched queries related to the 5 most common cancers using validated instruments. Search data were extracted from the publicly available Google Trends platform, and identical prompts were used to generate responses from 4 AI chatbots: ChatGPT version 3.5 (OpenAI), Perplexity (Perplexity.AI), Chatsonic (Writesonic), and Bing AI (Microsoft).
Exposures: Google Trends' top 5 search queries related to skin, lung, breast, colorectal, and prostate cancer from January 1, 2021, to January 1, 2023, were input into the 4 AI chatbots.
Main Outcomes and Measures: The primary outcomes were the quality of consumer health information, based on the validated DISCERN instrument (scores from 1 [low] to 5 [high]), and the understandability and actionability of this information, based on the understandability and actionability domains of the Patient Education Materials Assessment Tool (PEMAT) (scores of 0%-100%, with higher scores indicating greater understandability and actionability). Secondary outcomes included misinformation, scored on a 5-point Likert scale (from 1 [no misinformation] to 5 [high misinformation]), and readability, assessed with the Flesch-Kincaid Grade Level score.
Results: The analysis included 100 responses from the 4 chatbots to the 5 most common search queries for skin, lung, breast, colorectal, and prostate cancer. The quality of the text responses generated by the 4 AI chatbots was good (median [range] DISCERN score, 5 [2-5]), and no misinformation was identified. Understandability was moderate (median [range] PEMAT understandability score, 66.7% [33.3%-90.1%]), and actionability was poor (median [range] PEMAT actionability score, 20.0% [0%-40.0%]). The responses were written at a college reading level based on the Flesch-Kincaid Grade Level score.
Conclusions and Relevance: Findings of this cross-sectional study suggest that AI chatbots generally produce accurate information for the top cancer-related search queries, but the responses are not readily actionable and are written at a college reading level. These limitations suggest that AI chatbots should be used as a supplementary resource rather than a primary source of medical information.
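For context on the readability outcome: the Flesch-Kincaid Grade Level is a standard formula over sentence, word, and syllable counts, FKGL = 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59. The Python sketch below is a minimal illustration of that formula, not the study's scoring code; the regex-based syllable counter is a rough heuristic assumed for demonstration, and published analyses typically use dictionary-based counters.

import re

def count_syllables(word: str) -> int:
    # Rough vowel-group heuristic; real tools use pronunciation dictionaries.
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1  # drop a likely silent trailing "e"
    return max(n, 1)

def flesch_kincaid_grade(text: str) -> float:
    # FKGL = 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / max(len(sentences), 1)
            + 11.8 * syllables / max(len(words), 1) - 15.59)

# Hypothetical chatbot response used only to exercise the formula.
response = "Prostate cancer is a disease in which malignant cells form in the prostate."
print(round(flesch_kincaid_grade(response), 1))

A score of 13 or higher corresponds to college-level text, which is how the abstract's "college reading level" finding is typically interpreted.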
Pages: 1437-1440 (4 pages)
Related Articles (10 of 50 shown)
  • [1] Assessment of Artificial Intelligence Chatbot Responses to Top Searched Queries About Cancer
    Cei, Francesco
    Cacciamani, Giovanni Enrico
    EUROPEAN UROLOGY, 2024, 86 (03) : 278 - 279
  • [2] Responses of Five Different Artificial Intelligence Chatbots to the Top Searched Queries About Erectile Dysfunction: A Comparative Analysis
    Brzezinski, Jakub
    Olszewski, Robert
    JOURNAL OF MEDICAL SYSTEMS, 2024, 48 (01)
  • [3] Assessing the readability, reliability, and quality of artificial intelligence chatbot responses to the 100 most searched queries about cardiopulmonary resuscitation: An observational study
    Arca, Dilek Omur
    Erdemir, Ismail
    Kara, Fevzi
    Shermatov, Nurgazy
    Odacioglu, Muruvvet
    Ibisoglu, Emel
    Hanci, Ferid Baran
    Sagiroglu, Gonul
    Hanci, Volkan
    MEDICINE, 2024, 103 (22) : E38352
  • [4] Assessment of Artificial Intelligence Chatbot Responses to Common Patient Questions on Bone Sarcoma
    Khabaz, Kameel
    Newman-Hung, Nicole J.
    Kallini, Jennifer R.
    Kendal, Joseph
    Christ, Alexander B.
    Bernthal, Nicholas M.
    Wessel, Lauren E.
JOURNAL OF SURGICAL ONCOLOGY, 2024
  • [5] Evaluating insomnia queries from an artificial intelligence chatbot for patient education
    Alapati, Rahul
    Campbell, Daniel
    Molin, Nicole
    Creighton, Erin
    Wei, Zhikui
    Boon, Maurits
    Huntley, Colin
JOURNAL OF CLINICAL SLEEP MEDICINE, 2024, 20 (4): 583-594
  • [6] Evaluating the Quality of Artificial Intelligence Chatbot Responses to Patient Questions on Bladder Cancer
    Collin, Harry
    Roberts, Matthew
    ASIA-PACIFIC JOURNAL OF CLINICAL ONCOLOGY, 2023, 19 : 67 - 67
  • [7] Physician and Artificial Intelligence Chatbot Responses to Cancer Questions From Social Media
    Chen, David
    Parsa, Rod
    Hope, Andrew
    Hannon, Breffni
    Mak, Ernie
    Eng, Lawson
    Liu, Fei-Fei
    Fallah-Rad, Nazanin
    Heesters, Ann M.
    Raman, Srinivas
    JAMA ONCOLOGY, 2024, 10 (07) : 956 - 960
  • [8] Ensuring Safety and Consistency in Artificial Intelligence Chatbot Responses
    Zhu, Lingxuan
    Mou, Weiming
    Luo, Peng
JAMA ONCOLOGY, 2024
  • [9] Assessing AI Chatbot Responses to Hypothetical Patient Queries About Vertigo
    Bachina, Preetham
    Venkatesan, Arun
    Probasco, John
    Stern, Barney
    Green, Kar
    ANNALS OF NEUROLOGY, 2024, 96 : S205 - S206
  • [10] Performance of an Artificial Intelligence Chatbot in Ophthalmic Knowledge Assessment
    Mihalache, Andrew
    Popovic, Marko M.
    Muni, Rajeev H.
    JAMA OPHTHALMOLOGY, 2023, 141 (06) : 589 - 597