Natural language processing for automated quantification of bone metastases reported in free-text bone scintigraphy reports

Cited by: 17
Authors
Groot, Olivier Q. [1, 2]
Bongers, Michiel E. R. [1]
Karhade, Aditya V. [1]
Kapoor, Neal D. [1]
Fenn, Brian P. [1]
Kim, Jason [1]
Verlaan, J. J. [2]
Schwab, Joseph H. [1]
Affiliations
[1] Harvard Med Sch, Massachusetts Gen Hosp, Orthopaed Oncol Serv, Dept Orthopaed Surg, 55 Fruit St, Boston, MA 02114 USA
[2] Univ Utrecht, Univ Med Ctr Utrecht, Dept Orthopaed Surg, Utrecht, Netherlands
Funding
National Institutes of Health (USA)
DOI
10.1080/0284186X.2020.1819563
Chinese Library Classification number
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: The widespread use of electronic patient-generated health data has led to unprecedented opportunities for automated extraction of clinical features from free-text medical notes. However, processing this rich resource of data for clinical and research purposes depends on labor-intensive and potentially error-prone manual review. The aim of this study was to develop a natural language processing (NLP) algorithm for binary classification (single metastasis versus two or more metastases) of bone scintigraphy reports of patients undergoing surgery for bone metastases.

Material and methods: Bone scintigraphy reports of patients undergoing surgery for bone metastases were each labeled by three independent reviewers using a binary classification (single metastasis versus two or more metastases) to establish a ground truth. A stratified 80:20 split was used to develop and test an extreme gradient boosting supervised machine learning NLP algorithm.

Results: A total of 704 free-text bone scintigraphy reports from 704 patients were included in this study, and 617 (88%) had multiple bone metastases. In the independent test set (n = 141), which was not used for model development, the NLP algorithm achieved an AUC-ROC of 0.97 (95% confidence interval [CI], 0.92-0.99) for classification of multiple bone metastases and an AUC-PRC of 0.99 (95% CI, 0.99-0.99). At a threshold of 0.90, the NLP algorithm correctly identified multiple bone metastases in 117 of the 124 patients in the testing cohort who had multiple bone metastases (sensitivity 0.94) and yielded 3 false positives (specificity 0.82). At the same threshold, the NLP algorithm had a positive predictive value of 0.97 and an F1-score of 0.96.

Conclusions: NLP has the potential to automate clinical data extraction from free-text radiology notes in orthopedics, thereby optimizing the speed, accuracy, and consistency of clinical chart review. Pending external validation, the NLP algorithm developed in this study may be implemented as a means to aid researchers in tackling large amounts of data.
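The threshold-level metrics in the abstract can be checked against the counts it reports. The following is a minimal sketch, not the authors' code: the class and field names are hypothetical, and the true-negative count (14) is not stated in the abstract but is derived here from the test set size (141), the number of patients with multiple metastases (124), and the 3 false positives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts at one decision threshold (hypothetical helper, for illustration)."""
    tp: int  # multiple metastases, flagged as multiple
    fn: int  # multiple metastases, missed
    fp: int  # single metastasis, flagged as multiple
    tn: int  # single metastasis, correctly not flagged

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def f1(self) -> float:
        # Harmonic mean of PPV (precision) and sensitivity (recall).
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# Values taken from the abstract; tn is derived (assumption): 141 - 124 - 3 = 14.
test_set = ConfusionCounts(tp=117, fn=124 - 117, fp=3, tn=141 - 124 - 3)

if __name__ == "__main__":
    for name in ("sensitivity", "specificity", "ppv", "f1"):
        print(f"{name}: {getattr(test_set, name):.3f}")
```

Under these assumptions the derived sensitivity, specificity, and F1-score agree with the reported 0.94, 0.82, and 0.96. The positive predictive value works out to 117/120 = 0.975, which is consistent with the reported 0.97 up to rounding convention.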
Pages: 1455-1460
Page count: 6