Performance of an Open-Source Large Language Model in Extracting Information from Free-Text Radiology Reports

Cited by: 12
Authors
Le Guellec, Bastien [1, 2]
Lefevre, Alexandre [1]
Geay, Charlotte [3]
Shorten, Lucas [3]
Bruge, Cyril [1]
Hacein-Bey, Lotfi [4]
Amouyel, Philippe [2, 5]
Pruvo, Jean-Pierre [1, 6, 7]
Kuchcinski, Gregory [1, 6, 7]
Hamroun, Aghiles [2, 5]
Affiliations
[1] Univ Lille, Dept Neuroradiol, CHU Lille, Rue Emile Laine, F-59000 Lille, France
[2] Univ Lille, Dept Publ Hlth, CHU Lille, Rue Emile Laine, F-59000 Lille, France
[3] Univ Lille, CHU Lille, INclude Hlth Data Warehouse, Rue Emile Laine, F-59000 Lille, France
[4] UC Davis Hlth, Dept Radiol, Sacramento, CA 95817 USA
[5] Univ Lille, CHU Lille, Inst Pasteur Lille, Inserm, RID AGE Facteurs Ris & Determinants Mol Malad Liee, Lille, France
[6] Univ Lille, INSERM, LilNCog Lille Neurosci & Cognit U1172, Lille, France
[7] Univ Lille, Plateformes Lilloises Biol & Sante, UAR 2014, US 41, PLBS, Lille, France
DOI
10.1148/ryai.230364
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Purpose: To assess the performance of a local open-source large language model (LLM) in various information extraction tasks from real-life emergency brain MRI reports.
Materials and Methods: All consecutive emergency brain MRI reports written in 2022 from a French quaternary center were retrospectively reviewed. Two radiologists identified MRI scans that were performed in the emergency department for headaches. Four radiologists scored the reports' conclusions as either normal or abnormal. Abnormalities were labeled as either headache-causing or incidental. Vicuna (LMSYS Org), an open-source LLM, performed the same tasks. Vicuna's performance metrics were evaluated using the radiologists' consensus as the reference standard.
Results: Among the 2398 reports during the study period, radiologists identified 595 that included headaches in the indication (median age of patients, 35 years [IQR, 26-51 years]; 68% [403 of 595] women). A positive finding was reported in 227 of 595 (38%) cases, 136 of which could explain the headache. The LLM had a sensitivity of 98.0% (95% CI: 96.5, 99.0) and specificity of 99.3% (95% CI: 98.8, 99.7) for detecting the presence of headache in the clinical context, a sensitivity of 99.4% (95% CI: 98.3, 99.9) and specificity of 98.6% (95% CI: 92.2, 100.0) for the use of contrast medium injection, a sensitivity of 96.0% (95% CI: 92.5, 98.2) and specificity of 98.9% (95% CI: 97.2, 99.7) for study categorization as either normal or abnormal, and a sensitivity of 88.2% (95% CI: 81.6, 93.1) and specificity of 73% (95% CI: 62, 81) for causal inference between MRI findings and headache.
Conclusion: An open-source LLM was able to extract information from free-text radiology reports with excellent accuracy without requiring further training.
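The sensitivity and specificity figures in the abstract are proportions with 95% confidence intervals. A minimal sketch of that arithmetic follows, using the causal-inference task as the example. Several things here are assumptions and not taken from the paper: the abstract does not say which interval method was used, so the Wilson score interval is swapped in as one common choice and its bounds need not match the published ones; the task is assumed to have been scored on the 227 abnormal reports (136 headache-causing, 91 incidental); and the true-positive and true-negative counts (120 and 66) are back-calculated from the reported percentages. The function names are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def sensitivity_specificity(tp, fn, tn, fp):
    """Return point estimates and Wilson intervals from confusion-matrix counts."""
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


# ASSUMPTION: counts back-calculated from the abstract, not reported in it.
# 136 headache-causing findings, 91 incidental findings (227 - 136).
result = sensitivity_specificity(tp=120, fn=16, tn=66, fp=25)

for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.1%} (95% CI: {low:.1%}, {high:.1%})")
```

With these assumed counts the point estimates come out at about 88.2% and 72.5%, in line with the reported 88.2% and 73%. An exact (Clopper-Pearson) interval would be a reasonable alternative if the goal were to reproduce the published bounds.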
Pages: 11
Related papers
  • [21] Evaluating Patients' Experiences with Healthcare Services: Extracting Domain and Language-Specific Information from Free-Text Narratives
    Jacennik, Barbara
    Zawadzka-Gosk, Emilia
    Moreira, Joaquim Paulo
    Glinkowski, Wojciech Michal
    INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2022, 19 (16)
  • [22] Development of a Method for Extracting Structured Dose Information from Free-Text Electronic Prescriptions
    Liang, Man Qing
    Gidla, Vivek
    Verma, Aman
    Weir, Daniala
    Tamblyn, Robyn
    Buckeridge, David
    Motulsky, Aude
    MEDINFO 2019: HEALTH AND WELLBEING E-NETWORKS FOR ALL, 2019, 264 : 1568 - 1569
  • [23] Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study
    Yu, Amy Y. X.
    Liu, Zhongyu A.
    Pou-Prom, Chloe
    Lopes, Kaitlyn
    Kapral, Moira K.
    Aviv, Richard I.
    Mamdani, Muhammad
    JMIR MEDICAL INFORMATICS, 2021, 9 (05)
  • [24] Automated Identification and Measurement Extraction of Pancreatic Cystic Lesions from Free-Text Radiology Reports Using Natural Language Processing
    Yamashita, Rikiya
    Bird, Kristen
    Cheung, Philip Yue-Cheng
    Decker, Johannes Hugo
    Flory, Marta Nicole
    Goff, Daniel
    Morimoto, Linda Nayeli
    Shon, Andy
    Wentland, Andrew Louis
    Rubin, Daniel L.
    Desser, Terry S.
    RADIOLOGY-ARTIFICIAL INTELLIGENCE, 2022, 4 (02)
  • [25] Extracting socio-cultural networks of the Sudan from open-source, large-scale text data
    Diesner, Jana
    Carley, Kathleen M.
    Tambayong, Laurent
    COMPUTATIONAL AND MATHEMATICAL ORGANIZATION THEORY, 2012, 18 (03) : 328 - 339
  • [26] Using Natural Language Processing of Free-Text Radiology Reports to Identify Type 1 Modic Endplate Changes
    Huhdanpaa, Hannu T.
    Tan, W. Katherine
    Rundell, Sean D.
    Suri, Pradeep
    Chokshi, Falgun H.
    Comstock, Bryan A.
    Heagerty, Patrick J.
    James, Kathryn T.
    Avins, Andrew L.
    Nedeljkovic, Srdjan S.
    Nerenz, David R.
    Kallmes, David F.
    Luetmer, Patrick H.
    Sherman, Karen J.
    Organ, Nancy L.
    Griffith, Brent
    Langlotz, Curtis P.
    Carrell, David
    Hassanpour, Saeed
    Jarvik, Jeffrey G.
    JOURNAL OF DIGITAL IMAGING, 2018, 31 (01) : 84 - 90
  • [27] Targeted generative data augmentation for automatic metastases detection from free-text radiology reports
    Barabadi, Maede Ashofteh
    Zhu, Xiaodan
    Chan, Wai Yip
    Simpson, Amber L.
    Do, Richard K. G.
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2025, 8
  • [30] Privacy-ensuring Open-weights Large Language Models Are Competitive with Closed-weights GPT-4o in Extracting Chest Radiography Findings from Free-Text Reports
    Nowak, Sebastian
    Wulff, Benjamin
    Layer, Yannik C.
    Theis, Maike
    Isaak, Alexander
    Salam, Babak
    Block, Wolfgang
    Kuetting, Daniel
    Pieper, Claus C.
    Luetkens, Julian A.
    Attenberger, Ulrike
    Sprinkart, Alois M.
    RADIOLOGY, 2025, 314 (01)