Identifying signs and symptoms of urinary tract infection from emergency department clinical notes using large language models

Cited by: 4
Authors
Iscoe, Mark [1, 2]
Socrates, Vimig [2, 3]
Gilson, Aidan [4]
Chi, Ling [5]
Li, Huan [3]
Huang, Thomas [4]
Kearns, Thomas [1]
Perkins, Rachelle [1]
Khandjian, Laura [1]
Taylor, R. Andrew [1, 2]
Affiliations
[1] Yale Sch Med, Dept Emergency Med, New Haven, CT 06519 USA
[2] Yale Univ, Sch Med, Sect Biomed Informat & Data Sci, New Haven, CT USA
[3] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT USA
[4] Yale Sch Med, New Haven, CT 06519 USA
[5] Yale Sch Publ Hlth, Dept Biostat, New Haven, CT USA
Keywords
emergency medicine; infectious diseases; informatics; large language models; named entity recognition; natural language processing; urinary tract infection; INFORMATION; AGREEMENT; CARE; EXTRACTION; MANAGEMENT; ACCURACY; CRITERIA
DOI
10.1111/acem.14883
Chinese Library Classification number
R4 [Clinical Medicine]
Discipline classification code
1002; 100602
Abstract
Background: Natural language processing (NLP) tools including recently developed large language models (LLMs) have myriad potential applications in medical care and research, including the efficient labeling and classification of unstructured text such as electronic health record (EHR) notes. This opens the door to large-scale projects that rely on variables that are not typically recorded in a structured form, such as patient signs and symptoms.
Objectives: This study is designed to acquaint the emergency medicine research community with the foundational elements of NLP, highlighting essential terminology, annotation methodologies, and the intricacies involved in training and evaluating NLP models. Symptom characterization is critical to urinary tract infection (UTI) diagnosis, but identification of symptoms from the EHR has historically been challenging, limiting large-scale research, public health surveillance, and EHR-based clinical decision support. We therefore developed and compared two NLP models to identify UTI symptoms from unstructured emergency department (ED) notes.
Methods: The study population consisted of patients aged ≥18 years who presented to an ED in a northeastern U.S. health system between June 2013 and August 2021 and had a urinalysis performed. We annotated a random subset of 1250 ED clinician notes from these visits for a list of 17 UTI symptoms. We then developed two task-specific LLMs to perform the task of named entity recognition: a convolutional neural network-based model (SpaCy) and a transformer-based model designed to process longer documents (Clinical Longformer). Models were trained on 1000 notes and tested on a holdout set of 250 notes. We compared model performance (precision, recall, F1 measure) at identifying the presence or absence of UTI symptoms at the note level.
Results: A total of 8135 entities were identified in 1250 notes; 83.6% of notes included at least one entity. Overall F1 measure for note-level symptom identification weighted by entity frequency was 0.84 for the SpaCy model and 0.88 for the Longformer model. F1 measure for identifying presence or absence of any UTI symptom in a clinical note was 0.96 (232/250 correctly classified) for the SpaCy model and 0.98 (240/250 correctly classified) for the Longformer model.
Conclusions: The study demonstrated the utility of LLMs and transformer-based models in particular for extracting UTI symptoms from unstructured ED clinical notes; models were highly accurate for detecting the presence or absence of any UTI symptom on the note level, with variable performance for individual symptoms.
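As an illustration of the note-level evaluation described in the abstract, the following is a minimal sketch of how per-symptom precision, recall, and F1 could be computed from presence/absence labels and then combined into a frequency-weighted F1. It is not the authors' code. The function name `note_level_scores`, the toy notes, and the choice to weight each symptom by the number of gold-standard notes containing it are all assumptions made for illustration; the study itself weights by entity frequency, which may be defined differently.

```python
from collections import Counter


def note_level_scores(gold, pred, symptoms):
    """Compute per-symptom (precision, recall, F1) and a weighted F1.

    gold, pred: lists of sets, one set of symptom labels per note.
    symptoms: list of symptom labels to evaluate.
    Weighting (an assumption): each symptom's F1 is weighted by the
    number of gold notes in which that symptom is present.
    """
    per_symptom = {}
    support = Counter()
    for s in symptoms:
        # Note-level counts: a symptom is either present or absent in a note.
        tp = sum(1 for g, p in zip(gold, pred) if s in g and s in p)
        fp = sum(1 for g, p in zip(gold, pred) if s not in g and s in p)
        fn = sum(1 for g, p in zip(gold, pred) if s in g and s not in p)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_symptom[s] = (precision, recall, f1)
        support[s] = tp + fn
    total = sum(support.values())
    weighted_f1 = (
        sum(per_symptom[s][2] * support[s] for s in symptoms) / total
        if total
        else 0.0
    )
    return per_symptom, weighted_f1


# Hypothetical toy data: four notes, two symptoms.
gold_notes = [{"dysuria", "fever"}, {"dysuria"}, set(), {"fever"}]
pred_notes = [{"dysuria"}, {"dysuria", "fever"}, set(), {"fever"}]
scores, weighted = note_level_scores(gold_notes, pred_notes, ["dysuria", "fever"])
print(scores)
print(weighted)
```

In this toy example every note with dysuria is labeled correctly, while fever has one missed note and one false alarm, so the weighted score sits between the two per-symptom F1 values.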
Pages: 599-610
Page count: 12