Natural language processing for automated quantification of bone metastases reported in free-text bone scintigraphy reports

Cited by: 17

Authors:

Groot, Olivier Q. [1,2]

Bongers, Michiel E. R. [1]

Karhade, Aditya V. [1]

Kapoor, Neal D. [1]

Fenn, Brian P. [1]

Kim, Jason [1]

Verlaan, J. J. [2]

Schwab, Joseph H. [1]

Affiliations:

[1] Harvard Med Sch, Massachusetts Gen Hosp, Orthopaed Oncol Serv, Dept Orthopaed Surg, 55 Fruit St, Boston, MA 02114 USA

[2] Univ Utrecht, Univ Med Ctr Utrecht, Dept Orthopaed Surg, Utrecht, Netherlands

Source:

ACTA ONCOLOGICA | 2020 / Volume 59 / Issue 12

Funding:

US National Institutes of Health;

Keywords:

DOI:

10.1080/0284186X.2020.1819563

Chinese Library Classification number:

R73 [Oncology];

Discipline classification code:

100214;

Abstract:

Background: The widespread use of electronic patient-generated health data has led to unprecedented opportunities for automated extraction of clinical features from free-text medical notes. However, processing this rich resource of data for clinical and research purposes depends on labor-intensive and potentially error-prone manual review. The aim of this study was to develop a natural language processing (NLP) algorithm for binary classification (single metastasis versus two or more metastases) in bone scintigraphy reports of patients undergoing surgery for bone metastases.

Material and methods: Bone scintigraphy reports of patients undergoing surgery for bone metastases were each labeled by three independent reviewers using a binary classification (single metastasis versus two or more metastases) to establish a ground truth. A stratified 80:20 split was used to develop and test an extreme-gradient boosting supervised machine learning NLP algorithm.

Results: A total of 704 free-text bone scintigraphy reports from 704 patients were included in this study, and 617 (88%) had multiple bone metastases. In the independent test set (n = 141) not used for model development, the NLP algorithm achieved an AUC-ROC of 0.97 (95% confidence interval [CI], 0.92-0.99) for classification of multiple bone metastases and an AUC-PRC of 0.99 (95% CI, 0.99-0.99). At a threshold of 0.90, the NLP algorithm correctly identified multiple bone metastases in 117 of the 124 patients who had multiple bone metastases in the testing cohort (sensitivity 0.94) and yielded 3 false positives (specificity 0.82). At the same threshold, the NLP algorithm had a positive predictive value of 0.97 and an F1-score of 0.96.

Conclusions: NLP has the potential to automate clinical data extraction from free-text radiology notes in orthopedics, thereby optimizing the speed, accuracy, and consistency of clinical chart review. Pending external validation, the NLP algorithm developed in this study may be implemented as a means to aid researchers in tackling large amounts of data.
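
As a reading aid, the threshold metrics quoted in the abstract can be recomputed from the counts it reports. The sketch below is not the authors' code. It assumes that "multiple bone metastases" is the positive class, and it derives the false-negative and true-negative counts (which the abstract does not state directly) from the test-set size of 141, the 124 positive cases, the 117 correct identifications, and the 3 false positives. All variable names are illustrative.

```python
# Counts taken from the abstract (test set, threshold 0.90).
# Assumption: "multiple bone metastases" is the positive class.
n_test = 141        # reports in the independent test set
n_positive = 124    # reports with multiple bone metastases
true_pos = 117      # positives correctly identified
false_pos = 3       # single-metastasis reports flagged as multiple

# Counts derived here, not stated in the abstract.
false_neg = n_positive - true_pos    # positives that were missed
n_negative = n_test - n_positive     # single-metastasis reports
true_neg = n_negative - false_pos    # negatives correctly identified

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
ppv = true_pos / (true_pos + false_pos)
f1 = 2 * true_pos / (2 * true_pos + false_pos + false_neg)

print(f"sensitivity={sensitivity:.3f}")
print(f"specificity={specificity:.3f}")
print(f"ppv={ppv:.3f}")
print(f"f1={f1:.3f}")
```

Under these assumptions, the recomputed values agree with the reported sensitivity, specificity, positive predictive value, and F1-score to within rounding at two decimal places.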


Pages: 1455-1460

Page count: 6
