Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model

Authors
Wilson Lau
Kevin Lybarger
Martin L. Gunn
Meliha Yetisgen
Affiliations
[1] University of Washington, Biomedical & Health Informatics, School of Medicine
[2] University of Washington, Department of Radiology, School of Medicine
Source
Journal of Digital Imaging | 2023 | Vol. 36, No. 1
Keywords
Natural language processing; Information extraction; Event extraction; Deep learning
Abstract
Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging (“lesions”) and other types of clinical problems (“medical problems”). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9–93.4% F1 for finding triggers and 72.0–85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1–89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
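The event-based representation described in the abstract can be pictured as a trigger span plus a set of role-labeled argument spans. The following is a minimal illustrative sketch, not the authors' implementation: the class names, the label strings ("Lesion", "Size", "Anatomy"), the helper `find_span`, and the example sentence are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    text: str
    start: int  # character offset in the report (inclusive)
    end: int    # character offset in the report (exclusive)


@dataclass
class Argument:
    role: str   # hypothetical role label, e.g. "Size" or "Anatomy"
    span: Span


@dataclass
class FindingEvent:
    event_type: str  # hypothetical type label, e.g. "Lesion"
    trigger: Span
    arguments: List[Argument] = field(default_factory=list)


def find_span(report: str, phrase: str) -> Span:
    """Locate the first occurrence of `phrase` and return it as a Span."""
    start = report.index(phrase)  # raises ValueError if the phrase is absent
    return Span(phrase, start, start + len(phrase))


# Made-up sentence, used only to show how trigger and arguments relate.
report = "There is a 5 mm nodule in the right upper lobe."

event = FindingEvent("Lesion", find_span(report, "nodule"))
event.arguments.append(Argument("Size", find_span(report, "5 mm")))
event.arguments.append(Argument("Anatomy", find_span(report, "right upper lobe")))

for arg in event.arguments:
    print(f"{event.trigger.text} --{arg.role}--> {arg.span.text}")
```

In this sketch, entity extraction would correspond to producing the `Span` objects, and argument-role prediction would correspond to deciding which `Argument` links attach to which trigger.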
Pages: 91-104