Performance of an Open-Source Large Language Model in Extracting Information from Free-Text Radiology Reports

Cited by: 12
Authors
Le Guellec, Bastien [1, 2]
Lefevre, Alexandre [1]
Geay, Charlotte [3]
Shorten, Lucas [3]
Bruge, Cyril [1]
Hacein-Bey, Lotfi [4]
Amouyel, Philippe [2, 5]
Pruvo, Jean-Pierre [1, 6, 7]
Kuchcinski, Gregory [1, 6, 7]
Hamroun, Aghiles [2, 5]
Affiliations
[1] Univ Lille, Dept Neuroradiol, CHU Lille, Rue Emile Laine, F-59000 Lille, France
[2] Univ Lille, Dept Publ Hlth, CHU Lille, Rue Emile Laine, F-59000 Lille, France
[3] Univ Lille, CHU Lille, INclude Hlth Data Warehouse, Rue Emile Laine, F-59000 Lille, France
[4] UC Davis Hlth, Dept Radiol, Sacramento, CA 95817 USA
[5] Univ Lille, CHU Lille, Inst Pasteur Lille,Inserm, RID AGE Facteurs Ris & Determinants Mol Malad Liee, Lille, France
[6] Univ Lille, INSERM, LilNCog Lille Neurosci & Cognit U1172, Lille, France
[7] Univ Lille, Plateformes Lilloises Biol & Sante, UAR 2014, US 41,PLBS, Lille, France
DOI
10.1148/ryai.230364
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Purpose: To assess the performance of a local open-source large language model (LLM) in various information extraction tasks from real-life emergency brain MRI reports.
Materials and Methods: All consecutive emergency brain MRI reports written in 2022 from a French quaternary center were retrospectively reviewed. Two radiologists identified MRI scans that were performed in the emergency department for headaches. Four radiologists scored the reports' conclusions as either normal or abnormal. Abnormalities were labeled as either headache-causing or incidental. Vicuna (LMSYS Org), an open-source LLM, performed the same tasks. Vicuna's performance metrics were evaluated using the radiologists' consensus as the reference standard.
Results: Among the 2398 reports during the study period, radiologists identified 595 that included headaches in the indication (median age of patients, 35 years [IQR, 26-51 years]; 68% [403 of 595] women). A positive finding was reported in 227 of 595 (38%) cases, 136 of which could explain the headache. The LLM had a sensitivity of 98.0% (95% CI: 96.5, 99.0) and specificity of 99.3% (95% CI: 98.8, 99.7) for detecting the presence of headache in the clinical context, a sensitivity of 99.4% (95% CI: 98.3, 99.9) and specificity of 98.6% (95% CI: 92.2, 100.0) for the use of contrast medium injection, a sensitivity of 96.0% (95% CI: 92.5, 98.2) and specificity of 98.9% (95% CI: 97.2, 99.7) for study categorization as either normal or abnormal, and a sensitivity of 88.2% (95% CI: 81.6, 93.1) and specificity of 73% (95% CI: 62, 81) for causal inference between MRI findings and headache.
Conclusion: An open-source LLM was able to extract information from free-text radiology reports with excellent accuracy without requiring further training.
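The sensitivity and specificity figures in the abstract are proportions reported with 95% confidence intervals. The following is a minimal Python sketch of how such metrics could be computed from a confusion matrix. The abstract does not state which interval method the authors used, so the Wilson score interval here is an assumption, and the confusion-matrix counts (`tp`, `fn`, `tn`, `fp`) are hypothetical placeholders, not data from the study. Only the two percentage checks at the end use counts actually given in the abstract.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumed method for illustration; the abstract does not name the
    interval actually used by the authors.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) against a reference standard."""
    sensitivity = tp / (tp + fn)  # true positives among reference positives
    specificity = tn / (tn + fp)  # true negatives among reference negatives
    return sensitivity, specificity


# Hypothetical counts, chosen only to illustrate the calculation.
tp, fn, tn, fp = 90, 10, 180, 20
sens, spec = sensitivity_specificity(tp, fn, tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
print(f"sensitivity {sens:.3f} (95% CI: {sens_ci[0]:.3f}, {sens_ci[1]:.3f})")
print(f"specificity {spec:.3f} (95% CI: {spec_ci[0]:.3f}, {spec_ci[1]:.3f})")

# Percentages that can be checked from counts given in the abstract.
pct_abnormal = round(100 * 227 / 595)  # positive finding, reported as 38%
pct_women = round(100 * 403 / 595)     # women, reported as 68%
print(pct_abnormal, pct_women)
```

An exact (Clopper-Pearson) interval would be an equally plausible choice and tends to give slightly wider bounds for proportions close to 100%.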
Pages: 11
Related papers
50 in total
  • [41] Text2VQL: Teaching a Model Query Language to Open-Source Language Models with ChatGPT
    Lopez, Jose Antonio Hernandez
    Foldiak, Mate
    Varro, Daniel
    27TH INTERNATIONAL ACM/IEEE CONFERENCE ON MODEL DRIVEN ENGINEERING LANGUAGES AND SYSTEMS, MODELS, 2024, : 13 - 24
  • [42] Use of a Large Language Model to Identify and Classify Injuries With Free-Text Emergency Department Data
    Lorenzoni, Giulia
    Gregori, Dario
    Bressan, Silvia
    Ocagli, Honoria
    Azzolina, Danila
    Da Dalt, Liviana
    Berchialla, Paola
    JAMA NETWORK OPEN, 2024, 7 (05) : E2413208
  • [43] RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model
    Lu, Yao
    Liu, Shang
    Zhang, Qijun
    Xie, Zhiyao
    29TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2024, 2024, : 722 - 727
  • [44] Extracting Symptoms of Agitation in Dementia from Free-Text Nursing Notes Using Advanced Natural Language Processing
    Vithanage, Dinithi
    Zhu, Yunshu
    Zhang, Zhenyu
    Deng, Chao
    Yin, Mengyang
    Yu, Ping
    MEDINFO 2023 - THE FUTURE IS ACCESSIBLE, 2024, 310 : 700 - 704
  • [45] ChatGPT yields low accuracy in determining LI-RADS scores based on free-text and structured radiology reports in German language
    Fervers, Philipp
    Hahnfeldt, Robert
    Kottlors, Jonathan
    Wagner, Anton
    Maintz, David
    dos Santos, Daniel Pinto
    Lennartz, Simon
    Persigehl, Thorsten
    FRONTIERS IN RADIOLOGY, 2024, 4
  • [46] Developing prompts from large language model for extracting clinical information from pathology and ultrasound reports in breast cancer
    Choi, Hyeon Seok
    Song, Jun Yeong
    Shin, Kyung Hwan
    Chang, Ji Hyun
    Jang, Bum-Sup
    RADIATION ONCOLOGY JOURNAL, 2023, 41 (03): 209 - 216
  • [47] Automated classification of limb fractures from free-text radiology reports using a clinician-informed gazetteer methodology
    Wagholikar, Amol
    Zuccon, Guido
    Nguyen, Anthony
    Chu, Kevin
    Martin, Shane
    Lai, Kim
    Greenslade, Jaimi
    AUSTRALASIAN MEDICAL JOURNAL, 2013, 6 (05): 301 - 307
  • [49] How Natural Language Processing Can Aid With Pulmonary Oncology Tumor Node Metastasis Staging From Free-Text Radiology Reports: Algorithm Development and Validation
    Puts, Sander
    Nobel, Martijn
    Zegers, Catharina
    Bermejo, Inigo
    Robben, Simon
    Dekker, Andre
    JMIR FORMATIVE RESEARCH, 2023, 7
  • [50] OpenROAD-Assistant: An Open-Source Large Language Model for Physical Design Tasks
    Sharma, Utsav
    Wu, Bing-Yue
    Kankipati, Sai Rahul Dhanvi
    Chhabria, Vidya A.
    Rovinski, Austin
    PROCEEDINGS OF THE 2024 ACM/IEEE INTERNATIONAL SYMPOSIUM ON MACHINE LEARNING FOR CAD, MLCAD 2024, 2024,