Machine learning and natural language processing methods to identify ischemic stroke, acuity and location from radiology reports

Cited by: 51
Authors
Ong, Charlene Jennifer [1, 2, 3, 4]
Orfanoudaki, Agni [4]
Zhang, Rebecca [4]
Caprasse, Francois Pierre M. [4]
Hutch, Meghan [1, 2]
Ma, Liang [1]
Fard, Darian [1]
Balogun, Oluwafemi [1, 2]
Miller, Matthew I. [1]
Minnig, Margaret [1]
Saglam, Hanife [3]
Prescott, Brenton [2]
Greer, David M. [1, 2]
Smirnakis, Stelios [3]
Bertsimas, Dimitris [4, 5]
Affiliations
[1] Boston Univ, Sch Med, Boston, MA 02118 USA
[2] Boston Med Ctr, Boston, MA 02118 USA
[3] Harvard Med Sch, Boston, MA 02115 USA
[4] MIT, Operat Res Ctr, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[5] MIT, Sloan Sch Management, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
PLOS ONE | 2020 / Volume 15 / Issue 06
Keywords
ANNOTATION;
DOI
10.1371/journal.pone.0234908
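Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09
Abstract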
Accurate, automated extraction of clinical stroke information from unstructured text has several important applications. ICD-9/10 codes can misclassify ischemic stroke events and do not distinguish acuity or location. Expeditious, accurate data extraction could considerably improve the identification of stroke in large datasets, the triage of critical clinical reports, and quality improvement efforts. In this study, we developed and report a comprehensive framework for studying the performance of simple and complex stroke-specific Natural Language Processing (NLP) and Machine Learning (ML) methods to determine the presence, location, and acuity of ischemic stroke from radiographic text. We collected 60,564 Computed Tomography and Magnetic Resonance Imaging radiology reports from 17,864 patients at two large academic medical centers. We used standard techniques to featurize unstructured text and developed neurovascular-specific GloVe word embeddings. We trained various binary classification algorithms to identify stroke presence, location, and acuity using 75% of 1,359 expert-labeled reports. We validated our methods internally on the remaining 25% of reports and externally on 500 radiology reports from an entirely separate academic institution. In our internal population, GloVe word embeddings paired with deep learning (Recurrent Neural Networks) had the best discrimination of all methods for our three tasks (AUCs of 0.96, 0.98, and 0.93, respectively). Simpler NLP approaches (Bag of Words, BOW) performed best with interpretable algorithms (Logistic Regression) for identifying ischemic stroke (AUC of 0.95), middle cerebral artery (MCA) location (AUC of 0.96), and acuity (AUC of 0.90). Similarly, GloVe and Recurrent Neural Networks (AUCs of 0.92, 0.89, and 0.93) generalized better in our external test set than BOW and Logistic Regression (AUCs of 0.89, 0.86, and 0.80) for stroke presence, location, and acuity, respectively. Our study demonstrates a comprehensive assessment of NLP techniques for unstructured radiographic text. Our findings suggest that NLP/ML methods can be used to discriminate stroke features from large data cohorts for both clinical and research-related investigations.
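As a rough illustration of the simpler pipeline the abstract mentions (Bag of Words features, Logistic Regression, a 75%/25% split, AUC as the metric), here is a minimal sketch in plain Python. It is not the authors' code: the eight report snippets and their labels are invented for illustration, and the helper names (tokenize, build_vocab, featurize, train_logreg, auc) and the hyperparameters are assumptions of this sketch.

```python
import math

def tokenize(text):
    # Lowercase whitespace tokenization; real pipelines do more cleaning.
    return text.lower().split()

def build_vocab(texts):
    # Map each distinct token to a column index, in order of first appearance.
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab

def featurize(text, vocab):
    # Bag of Words: count of each vocabulary token; unknown tokens are ignored.
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def train_logreg(X, y, lr=0.1, epochs=500):
    # Unregularized logistic regression fit by full-batch gradient descent.
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            gb += err
            for j, xj in enumerate(xi):
                gw[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def auc(labels, scores):
    # AUC as the fraction of (positive, negative) pairs ranked correctly,
    # counting ties as half. Requires at least one label of each class.
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy reports (1 = ischemic stroke present, 0 = absent).
# Six for training and two held out, mirroring a 75%/25% split.
train = [
    ("acute infarct in the left mca territory", 1),
    ("restricted diffusion consistent with acute ischemic infarct", 1),
    ("subacute infarct involving the right mca distribution", 1),
    ("no acute intracranial abnormality", 0),
    ("no evidence of infarct or hemorrhage", 0),
    ("normal study without restricted diffusion", 0),
]
held_out = [
    ("acute ischemic infarct in the right mca territory", 1),
    ("no evidence of acute abnormality", 0),
]

vocab = build_vocab(text for text, _ in train)
w, b = train_logreg([featurize(t, vocab) for t, _ in train],
                    [label for _, label in train])
held_out_probs = [predict_proba(featurize(t, vocab), w, b) for t, _ in held_out]
held_out_auc = auc([label for _, label in held_out], held_out_probs)
print("held-out AUC:", held_out_auc)
```

The learned weights in w line up with the tokens in vocab, which is what makes this kind of model interpretable: a positive weight marks a word that pushes a report toward the stroke label. A toy set this small says nothing about real performance.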
Pages: 16