Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model

Cited by: 0

Authors:

Wilson Lau

Kevin Lybarger

Martin L. Gunn

Meliha Yetisgen

Affiliations:

[1] University of Washington, Biomedical & Health Informatics, School of Medicine

[2] University of Washington, Department of Radiology, School of Medicine

Source:

Journal of Digital Imaging | 2023 / Volume 36

Keywords:

Natural language processing; Information extraction; Event extraction; Deep learning

DOI:

Not available

Abstract:

Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging (“lesions”) and other types of clinical problems (“medical problems”). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9–93.4% F1 for finding triggers and 72.0–85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1–89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
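The event-based representation described in the abstract can be pictured as a trigger span with typed argument spans attached to it. The following is a minimal illustrative sketch only, not the authors' annotation format or code: the class names, the `make_span` helper, the role strings, and the example sentence are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """A contiguous piece of report text with character offsets."""
    text: str
    start: int
    end: int


@dataclass
class Argument:
    """An argument entity linked to a trigger through a role."""
    role: str  # hypothetical labels, e.g. "anatomy", "size", "count"
    span: Span


@dataclass
class FindingEvent:
    """One finding: a trigger plus the arguments attached to it."""
    event_type: str  # hypothetical labels: "lesion" or "medical_problem"
    trigger: Span
    arguments: List[Argument] = field(default_factory=list)

    def roles(self, role: str) -> List[str]:
        """Return the text of every argument that fills the given role."""
        return [a.span.text for a in self.arguments if a.role == role]


def make_span(sentence: str, phrase: str) -> Span:
    """Hypothetical helper: locate the first occurrence of phrase."""
    start = sentence.index(phrase)
    return Span(phrase, start, start + len(phrase))


# Invented example sentence, not taken from the corpus.
sentence = "There is a 4 mm nodule in the right upper lobe."
event = FindingEvent(
    event_type="lesion",
    trigger=make_span(sentence, "nodule"),
    arguments=[
        Argument("size", make_span(sentence, "4 mm")),
        Argument("anatomy", make_span(sentence, "right upper lobe")),
    ],
)
```

In this sketch, `event.roles("anatomy")` returns the anatomy phrases linked to the trigger. Under these assumptions, the trigger and argument spans correspond to what an entity extraction step would produce, and each `Argument.role` corresponds to a link that a relation extraction step would predict.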

Citation:

Pages: 91-104

Page count: 13
