MultiThal-classifier, a machine learning-based multi-class model for thalassemia diagnosis and classification

Cited by: 0
Authors
Wang, Wenqiang [1]
Ye, Renqing [1]
Tang, Baojia [1]
Qi, Yuying [1]
Affiliations
[1] Ningde Normal Univ, Dept Clin Lab, Ningde Municipal Hosp, 13 Mindong Rd East,Dongqiao Econ & Technol Dev Zon, Ningde 352100, Fujian, Peoples R China
Keywords
Thalassemia; Iron Deficiency Anemia; Machine Learning; Multi-Class Model; Hematological Parameters; IRON-DEFICIENCY;
DOI
10.1016/j.cca.2024.120025
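Chinese Library Classification (CLC) number
R446 [Laboratory diagnosis]; R-33 [Experimental medicine, medical experiments]
Discipline classification code
1001
Abstract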
Background: The differential diagnosis between iron deficiency anemia (IDA) and thalassemia trait (TT) remains a significant clinical challenge. This study aimed to develop a machine learning-based multi-class model to differentiate among Microcytic-TT (TT with low mean corpuscular volume), Normocytic-TT (TT with normal mean corpuscular volume), IDA, and healthy individuals. Methods: A dataset comprising 1,819 individuals was analyzed using six machine learning algorithms. The eXtreme Gradient Boosting (XGBoost) algorithm was ultimately selected to construct the MultiThal-Classifier (M-THAL) model. SMOTENC (Synthetic Minority Over-sampling Technique for Nominal and Continuous features) was employed to address data imbalance. Model performance was evaluated using several metrics, and SHAP (SHapley Additive exPlanations) values were applied to interpret the model's predictions. Additionally, external validation was conducted to assess the model's robustness and generalizability. Results: After 1,000 bootstrap resamples of the test set, the average performance metrics of M-THAL, with 95% confidence intervals (CI), were as follows: sensitivity 90.27% (95% CI: 84.88-95.26), specificity 97.87% (95% CI: 97.10-98.55), PPV 93.42% (95% CI: 89.34-96.48), NPV 97.82% (95% CI: 97.00-98.53), F1 score 91.50% (95% CI: 87.29-95.34), Youden's index 88.15% (95% CI: 82.33-93.70), accuracy 97.06% (95% CI: 96.06-97.99), and AUC 94.07% (95% CI: 91.17-96.84). Feature importance analysis identified mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), red cell distribution width - standard deviation (RDW-SD), and hemoglobin (HGB) as the most important features. External validation confirmed the model's robustness and generalizability.
Conclusion: The M-THAL model effectively distinguishes Normocytic-TT, Microcytic-TT, IDA, and healthy individuals using hematological parameters, and offers a rapid and cost-effective screening tool that can be readily implemented in diverse healthcare settings.
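The bootstrap evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not say how per-class metrics were aggregated or how the intervals were computed, so the one-vs-rest macro-averaging, the percentile interval, the label strings, and the helper names `one_vs_rest_metrics` and `bootstrap_ci` below are all assumptions made for illustration, not the authors' procedure.

```python
import random

# Assumed label names for the four classes mentioned in the abstract.
CLASSES = ["Microcytic-TT", "Normocytic-TT", "IDA", "Healthy"]


def one_vs_rest_metrics(y_true, y_pred, classes=CLASSES):
    """Macro-averaged sensitivity, specificity and Youden's index.

    Each class is scored one-vs-rest; this averaging scheme is an
    assumption, since the abstract does not specify one.
    """
    sens, spec = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        # A class with no positives (or no negatives) in this sample
        # contributes nothing to the corresponding average.
        if tp + fn:
            sens.append(tp / (tp + fn))
        if tn + fp:
            spec.append(tn / (tn + fp))
    s = sum(sens) / len(sens)
    sp = sum(spec) / len(spec)
    return {"sensitivity": s, "specificity": sp, "youden": s + sp - 1}


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Mean and percentile CI of each metric over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = {"sensitivity": [], "specificity": [], "youden": []}
    for _ in range(n_resamples):
        # Resample test-set indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        m = one_vs_rest_metrics([y_true[i] for i in idx],
                                [y_pred[i] for i in idx])
        for name, value in m.items():
            stats[name].append(value)
    result = {}
    for name, values in stats.items():
        values.sort()
        lower = values[int((alpha / 2) * (n_resamples - 1))]
        upper = values[int((1 - alpha / 2) * (n_resamples - 1))]
        result[name] = (sum(values) / len(values), lower, upper)
    return result
```

Calling `bootstrap_ci(y_true, y_pred)` on the held-out labels and the model's predictions returns, for each metric, a `(mean, lower, upper)` tuple analogous in form to the "average (95% CI)" figures reported above. Youden's index is computed here as sensitivity + specificity - 1.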
Pages: 10