A comparative study on deep learning models for text classification of unstructured medical notes with various levels of class imbalance

Cited by: 38
|
Authors
Lu, Hongxia [1]
Ehwerhemuepha, Louis [1,2]
Rakovski, Cyril [1]
Affiliations
[1] Chapman Univ, Schmid Coll Sci & Technol, 1 Univ Dr, Orange, CA 92866 USA
[2] CHOC, Orange, CA 92868 USA
Keywords
Medical notes; Text classification; BERT; CNN; Deep learning; Embedding; Transformer encoder;
DOI
10.1186/s12874-022-01665-y
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services management)]
Abstract
Background: Discharge medical notes written by physicians contain important information about the health condition of patients. Many deep learning algorithms have been successfully applied to extract, from unstructured medical notes, information that can drive actionable results in the medical domain. This study explores the performance of various deep learning algorithms on text classification tasks over medical notes under different disease class imbalance scenarios.

Methods: We employed seven artificial intelligence models: a CNN (Convolutional Neural Network), a Transformer encoder, a pretrained BERT (Bidirectional Encoder Representations from Transformers) model, and four typical sequence neural network models, namely RNN (Recurrent Neural Network), GRU (Gated Recurrent Unit), LSTM (Long Short-Term Memory), and Bi-LSTM (Bi-directional Long Short-Term Memory), to classify the presence or absence of 16 disease conditions from patients' discharge summary notes. We framed the task as 16 separate binary classification problems. The performance of the seven models on each of the 16 datasets, with various levels of imbalance between classes, was compared in terms of AUC-ROC (Area Under the Receiver Operating Characteristic Curve), AUC-PR (Area Under the Precision-Recall Curve), F1 score, and balanced accuracy, as well as training time. Model performance was also compared in combination with different word embedding approaches (GloVe, BioWordVec, and no pre-trained word embeddings).

Results: The analyses of these 16 binary classification problems showed that the Transformer encoder performs best in nearly all scenarios. In addition, when the disease prevalence is close to or greater than 50%, the CNN achieved performance comparable to the Transformer encoder, with a training time 17.6% shorter than the second-fastest model, 91.3% shorter than the Transformer encoder, and 94.7% shorter than the pre-trained BERT-Base model. The BioWordVec embeddings slightly improved the performance of the Bi-LSTM model in most disease prevalence scenarios, while the CNN model performed better without pre-trained word embeddings. Training time was significantly reduced with the GloVe embeddings for all models.

Conclusions: For classification tasks on medical notes, Transformer encoders are the best choice if computational resources are not an issue. Otherwise, when the classes are relatively balanced, CNNs are a leading candidate because of their competitive performance and computational efficiency.
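The study evaluates each of its 16 binary classifiers with AUC-ROC, AUC-PR, F1 score, and balanced accuracy. As a minimal illustrative sketch (not the authors' code, and with made-up labels and scores), the four metrics can be computed from scratch for one binary task as follows:

```python
def auc_roc(labels, scores):
    # Probability that a random positive is scored above a random negative
    # (ties count as half), i.e. the Mann-Whitney formulation of AUC-ROC.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_pr(labels, scores):
    # Average precision: mean of precision at the rank of each positive,
    # a standard estimator of the area under the precision-recall curve.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, ap = 0, 0.0
    for rank, i in enumerate(order, 1):
        if labels[i] == 1:
            tp += 1
            ap += tp / rank
    return ap / tp

def _confusion(labels, preds):
    tp = sum(y == 1 and p == 1 for y, p in zip(labels, preds))
    tn = sum(y == 0 and p == 0 for y, p in zip(labels, preds))
    fp = sum(y == 0 and p == 1 for y, p in zip(labels, preds))
    fn = sum(y == 1 and p == 0 for y, p in zip(labels, preds))
    return tp, tn, fp, fn

def f1_score(labels, preds):
    tp, tn, fp, fn = _confusion(labels, preds)
    return 2 * tp / (2 * tp + fp + fn)

def balanced_accuracy(labels, preds):
    # Mean of sensitivity and specificity: robust under class imbalance,
    # which is why the study reports it alongside F1 and the AUCs.
    tp, tn, fp, fn = _confusion(labels, preds)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

For example, with `labels = [1, 1, 0, 0]` and model scores `[0.9, 0.4, 0.3, 0.2]`, thresholding at 0.5 gives predictions `[1, 0, 0, 0]`; the ranking metrics use the raw scores while F1 and balanced accuracy use the thresholded predictions. Unlike plain accuracy, balanced accuracy does not reward a classifier that predicts only the majority class, which matters in the low-prevalence scenarios the paper studies.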
Pages: 12
Related Articles
(50 records)
  • [1] A comparative study on deep learning models for text classification of unstructured medical notes with various levels of class imbalance
    Hongxia Lu
    Louis Ehwerhemuepha
    Cyril Rakovski
    BMC Medical Research Methodology, 22
  • [2] Unraveling the Impact of Class Imbalance on Deep-Learning Models for Medical Image Classification
    Hellin, Carlos J.
    Olmedo, Alvaro A.
    Valledor, Adrian
    Gomez, Josefa
    Lopez-Benitez, Miguel
    Tayebi, Abdelhamid
    APPLIED SCIENCES-BASEL, 2024, 14 (08):
  • [3] Comparative Analysis of Deep Learning Models for Myanmar Text Classification
    Phyu, Myat Sapal
    Nwet, Khin Thandar
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2020), PT I, 2020, 12033 : 76 - 85
  • [4] Unstructured Medical Text Classification Using Linguistic Analysis: A Supervised Deep Learning Approach
    Al-Doulat, Ahmad
    Obaidat, Islam
    Lee, Minwoo
    2019 IEEE/ACS 16TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA 2019), 2019,
  • [5] A comparative study on various pre-processing techniques and deep learning algorithms for text classification
    Bhuvaneshwari P.
    Rao A.N.
    International Journal of Cloud Computing, 2022, 11 (01): : 61 - 78
  • [6] A Comparative Study on Word Embeddings in Deep Learning for Text Classification
    Wang, Congcong
    Nulty, Paul
    Lillis, David
    2020 4TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2020, 2020, : 37 - 46
  • [7] Medical Text Classification Using Hybrid Deep Learning Models with Multihead Attention
    Prabhakar, Sunil Kumar
    Won, Dong-Ok
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021
  • [8] A Comparative Text Classification Study with Deep Learning-Based Algorithms
    Koksal, Omer
    Akgul, Ozlem
    2022 9TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ICEEE 2022), 2022, : 387 - 391
  • [9] Classification Models of Text: A Comparative Study
    Zhan, Tiffany
    2021 IEEE 11TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2021, : 1221 - 1225
  • [10] Addressing class imbalance in deep learning for acoustic target classification
    Pala, Ahmet
    Oleynik, Anna
    Utseth, Ingrid
    Handegard, Nils Olav
    ICES JOURNAL OF MARINE SCIENCE, 2023, 80 (10) : 2530 - 2544