Natural language processing for automated quantification of bone metastases reported in free-text bone scintigraphy reports

Cited by: 17
Authors
Groot, Olivier Q. [1, 2]
Bongers, Michiel E. R. [1]
Karhade, Aditya V. [1]
Kapoor, Neal D. [1]
Fenn, Brian P. [1]
Kim, Jason [1]
Verlaan, J. J. [2]
Schwab, Joseph H. [1]
Affiliations
[1] Harvard Med Sch, Massachusetts Gen Hosp, Orthopaed Oncol Serv, Dept Orthopaed Surg, 55 Fruit St, Boston, MA 02114 USA
[2] Univ Utrecht, Univ Med Ctr Utrecht, Dept Orthopaed Surg, Utrecht, Netherlands
Funding
National Institutes of Health (USA)
DOI
10.1080/0284186X.2020.1819563
Chinese Library Classification (CLC) number
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: The widespread use of electronic patient-generated health data has led to unprecedented opportunities for automated extraction of clinical features from free-text medical notes. However, processing this rich resource of data for clinical and research purposes depends on labor-intensive and potentially error-prone manual review. The aim of this study was to develop a natural language processing (NLP) algorithm for binary classification (single metastasis versus two or more metastases) of bone scintigraphy reports of patients undergoing surgery for bone metastases.

Material and methods: Bone scintigraphy reports of patients undergoing surgery for bone metastases were each labeled by three independent reviewers using a binary classification (single metastasis versus two or more metastases) to establish a ground truth. A stratified 80:20 split was used to develop and test an extreme gradient boosting supervised machine learning NLP algorithm.

Results: A total of 704 free-text bone scintigraphy reports from 704 patients were included in this study, and 617 (88%) had multiple bone metastases. In the independent test set (n = 141), which was not used for model development, the NLP algorithm achieved an AUC-ROC of 0.97 (95% confidence interval [CI], 0.92-0.99) for classification of multiple bone metastases and an AUC-PRC of 0.99 (95% CI, 0.99-0.99). At a threshold of 0.90, the NLP algorithm correctly identified multiple bone metastases in 117 of the 124 patients who had multiple bone metastases in the testing cohort (sensitivity 0.94) and yielded 3 false positives (specificity 0.82). At the same threshold, the NLP algorithm had a positive predictive value of 0.97 and an F1-score of 0.96.

Conclusions: NLP has the potential to automate clinical data extraction from free-text radiology notes in orthopedics, thereby optimizing the speed, accuracy, and consistency of clinical chart review. Pending external validation, the NLP algorithm developed in this study may be implemented as a means to aid researchers in tackling large amounts of data.
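
The threshold-level figures in the Results can be rechecked from the counts the abstract gives. The following is a minimal sketch, not the authors' code: the class and function names are hypothetical, and the true-negative count is not stated in the abstract but is derived here on the assumption that the 141 test reports split into 124 with multiple metastases and 17 with a single metastasis, of which 3 were false positives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts, with 'multiple metastases' as the positive class."""
    tp: int  # multiple metastases, predicted multiple
    fn: int  # multiple metastases, predicted single
    fp: int  # single metastasis, predicted multiple
    tn: int  # single metastasis, predicted single


def summarize(c: ConfusionCounts) -> dict:
    """Compute sensitivity, specificity, PPV and F1 from raw counts."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f1": f1,
    }


# Counts reported in the abstract: 141 test reports, 124 positives,
# 117 of them detected, 3 false positives.
# Assumption: the remaining negatives (141 - 124 - 3 = 14) are true negatives.
counts = ConfusionCounts(tp=117, fn=124 - 117, fp=3, tn=141 - 124 - 3)
metrics = summarize(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the computed values (about 0.944, 0.824, 0.975 and 0.959) agree with the reported sensitivity, specificity, positive predictive value and F1-score to within rounding.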
Pages: 1455-1460
Page count: 6