Evaluating base and retrieval augmented LLMs with document or online support for evidence based neurology

Authors
Masanneck, Lars [1]
Meuth, Sven G.
Pawlitzki, Marc
Affiliations
[1] Heinrich Heine Univ Dusseldorf, Med Fac, Dept Neurol, Dusseldorf, Germany
Source
NPJ DIGITAL MEDICINE | 2025 / Vol. 8 / Issue 01
DOI
10.1038/s41746-025-01536-y
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Abstract
Effectively managing evidence-based information is increasingly challenging. This study tested large language models (LLMs), including document- and online-enabled retrieval-augmented generation (RAG) systems, using 130 questions drawn from 13 recent neurology guidelines. Results showed substantial variability. RAG improved accuracy compared to base models but still produced potentially harmful answers. RAG-based systems performed worse on case-based than on knowledge-based questions. Further refinement and improved regulation are needed for safe clinical integration of RAG-enhanced LLMs.
Pages: 5