Natural language processing for automated quantification of bone metastases reported in free-text bone scintigraphy reports

Cited by: 17
Authors
Groot, Olivier Q. [1 ,2 ]
Bongers, Michiel E. R. [1 ]
Karhade, Aditya V. [1 ]
Kapoor, Neal D. [1 ]
Fenn, Brian P. [1 ]
Kim, Jason [1 ]
Verlaan, J. J. [2 ]
Schwab, Joseph H. [1 ]
Affiliations
[1] Harvard Med Sch, Massachusetts Gen Hosp, Orthopaed Oncol Serv, Dept Orthopaed Surg, 55 Fruit St, Boston, MA 02114 USA
[2] Univ Utrecht, Univ Med Ctr Utrecht, Dept Orthopaed Surg, Utrecht, Netherlands
Funding
National Institutes of Health (United States)
DOI
10.1080/0284186X.2020.1819563
Chinese Library Classification number
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: The widespread use of electronic patient-generated health data has led to unprecedented opportunities for automated extraction of clinical features from free-text medical notes. However, processing this rich resource of data for clinical and research purposes depends on labor-intensive and potentially error-prone manual review. The aim of this study was to develop a natural language processing (NLP) algorithm for binary classification (single metastasis versus two or more metastases) of bone scintigraphy reports of patients undergoing surgery for bone metastases.

Material and methods: Bone scintigraphy reports of patients undergoing surgery for bone metastases were each labeled by three independent reviewers using a binary classification (single metastasis versus two or more metastases) to establish a ground truth. A stratified 80:20 split was used to develop and test an extreme gradient boosting supervised machine learning NLP algorithm.

Results: A total of 704 free-text bone scintigraphy reports from 704 patients were included in this study, and 617 (88%) had multiple bone metastases. In the independent test set (n = 141) not used for model development, the NLP algorithm achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.97 (95% confidence interval [CI], 0.92-0.99) for classification of multiple bone metastases and an area under the precision-recall curve (AUC-PRC) of 0.99 (95% CI, 0.99-0.99). At a threshold of 0.90, the NLP algorithm correctly identified multiple bone metastases in 117 of the 124 patients in the testing cohort who had multiple bone metastases (sensitivity 0.94) and yielded 3 false positives among the remaining 17 patients (specificity 0.82). At the same threshold, the NLP algorithm had a positive predictive value of 0.97 and an F1-score of 0.96.

Conclusions: NLP has the potential to automate clinical data extraction from free-text radiology notes in orthopedics, thereby optimizing the speed, accuracy, and consistency of clinical chart review. Pending external validation, the NLP algorithm developed in this study may be implemented as a means to aid researchers in tackling large amounts of data.
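
The threshold metrics reported in the abstract can be checked from the counts it gives. The sketch below is not the authors' code. It rebuilds the confusion matrix at the 0.90 threshold from the stated figures (141 test reports, 124 with multiple metastases, 117 of those detected, 3 false positives) and recomputes sensitivity, specificity, positive predictive value, and F1-score. The function name `metrics_from_counts` and the variable names are hypothetical, chosen for illustration.

```python
def metrics_from_counts(tp, fn, fp, tn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; not taken from the study.
    """
    return {
        "sensitivity": tp / (tp + fn),        # true positive rate (recall)
        "specificity": tn / (tn + fp),        # true negative rate
        "ppv": tp / (tp + fp),                # positive predictive value (precision)
        "f1": 2 * tp / (2 * tp + fp + fn),    # harmonic mean of precision and recall
    }


# Counts derived from the abstract (test set, threshold 0.90).
n_test = 141
n_multiple = 124               # reports labeled "two or more metastases"
tp = 117                       # multiple metastases correctly identified
fn = n_multiple - tp           # 7 missed
fp = 3                         # single-metastasis reports flagged as multiple
tn = (n_test - n_multiple) - fp  # 14 of the 17 single-metastasis reports

metrics = metrics_from_counts(tp, fn, fp, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The recomputed sensitivity, specificity, and F1-score round to the reported 0.94, 0.82, and 0.96. The unrounded positive predictive value is 117/120 = 0.975, which is consistent with the reported 0.97 if that figure was truncated rather than rounded.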
Pages: 1455-1460
Number of pages: 6