MultiThal-classifier, a machine learning-based multi-class model for thalassemia diagnosis and classification

Authors
Wang, Wenqiang [1]
Ye, Renqing [1]
Tang, Baojia [1]
Qi, Yuying [1]
Affiliations
[1] Ningde Normal Univ, Dept Clin Lab, Ningde Municipal Hosp, 13 Mindong Rd East,Dongqiao Econ & Technol Dev Zon, Ningde 352100, Fujian, Peoples R China
Keywords
Thalassemia; Iron Deficiency Anemia; Machine Learning; Multi-Class Model; Hematological Parameters; IRON-DEFICIENCY;
DOI
10.1016/j.cca.2024.120025
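Chinese Library Classification (CLC) number
R446 [Laboratory diagnosis]; R-33 [Experimental medicine, medical experiments]
Discipline classification code
1001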
Abstract
Background: The differential diagnosis between iron deficiency anemia (IDA) and thalassemia trait (TT) remains a significant clinical challenge. This study aimed to develop a machine learning-based multi-class model to differentiate among Microcytic-TT (TT with low mean corpuscular volume), Normocytic-TT (TT with normal mean corpuscular volume), IDA, and healthy individuals.
Methods: A comprehensive dataset comprising 1,819 individuals was analyzed using six distinct machine learning algorithms. The eXtreme Gradient Boosting (XGBoost) algorithm was ultimately selected to construct the MultiThal-Classifier (M-THAL) model. SMOTENC (Synthetic Minority Over-sampling Technique for Nominal and Continuous features) was employed to address data imbalance. Model performance was evaluated using various metrics, and SHAP values were applied to interpret the model's predictions. Additionally, external validation was conducted to assess the model's robustness and generalizability.
Results: After performing 1,000 bootstrap resamples of the test set, the average performance metrics of M-THAL, with 95% confidence intervals (CI), were as follows: sensitivity 90.27% (95% CI: 84.88-95.26), specificity 97.87% (95% CI: 97.10-98.55), PPV 93.42% (95% CI: 89.34-96.48), NPV 97.82% (95% CI: 97.00-98.53), F1 score 91.50% (95% CI: 87.29-95.34), Youden's index 88.15% (95% CI: 82.33-93.70), accuracy 97.06% (95% CI: 96.06-97.99), and AUC 94.07% (95% CI: 91.17-96.84). Feature importance analysis identified mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), red cell distribution width-standard deviation (RDW-SD), and hemoglobin (HGB) as the most important features. External validation confirmed the model's robustness and generalizability.
Conclusion: The M-THAL effectively distinguishes Normocytic-TT, Microcytic-TT, IDA, and healthy individuals using hematological parameters and offers a rapid and cost-effective screening tool that can be readily implemented in diverse healthcare settings.
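The abstract reports each metric as a mean over 1,000 bootstrap resamples of the test set with a 95% CI, but it does not say how the per-class values were aggregated or how the interval was formed. The following is a minimal sketch, assuming macro-averaged one-vs-rest sensitivity and specificity and a percentile bootstrap; the function names (`macro_metrics`, `bootstrap_ci`) and the toy labels are hypothetical and are not taken from the paper.

```python
import random

# The four classes named in the abstract.
CLASSES = ["Microcytic-TT", "Normocytic-TT", "IDA", "Healthy"]


def macro_metrics(y_true, y_pred, classes=CLASSES):
    """Macro-averaged one-vs-rest sensitivity, specificity and Youden's index."""
    sens, spec = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        # A class can be absent from a resample; skip it rather than divide by zero.
        if tp + fn:
            sens.append(tp / (tp + fn))
        if tn + fp:
            spec.append(tn / (tn + fp))
    sensitivity = sum(sens) / len(sens)
    specificity = sum(spec) / len(spec)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": sensitivity + specificity - 1,  # Youden's J
    }


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Return {metric: (mean, lower, upper)} from a percentile bootstrap."""
    rng = random.Random(seed)
    n = len(y_true)
    draws = {}
    for _ in range(n_resamples):
        # Resample test-set indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        metrics = macro_metrics([y_true[i] for i in idx], [y_pred[i] for i in idx])
        for name, value in metrics.items():
            draws.setdefault(name, []).append(value)
    summary = {}
    for name, values in draws.items():
        values.sort()
        lower = values[int((alpha / 2) * (n_resamples - 1))]
        upper = values[int((1 - alpha / 2) * (n_resamples - 1))]
        summary[name] = (sum(values) / n_resamples, lower, upper)
    return summary


# Toy labels for illustration only; these are not data from the study.
y_true = ["IDA", "IDA", "Healthy", "Healthy",
          "Microcytic-TT", "Microcytic-TT", "Normocytic-TT", "Normocytic-TT"]
y_pred = ["IDA", "Healthy", "Healthy", "Healthy",
          "Microcytic-TT", "Microcytic-TT", "Normocytic-TT", "IDA"]

if __name__ == "__main__":
    for name, (mean, lower, upper) in bootstrap_ci(y_true, y_pred).items():
        print(f"{name}: {mean:.4f} (95% CI: {lower:.4f}-{upper:.4f})")
```

Under this definition Youden's index is sensitivity plus specificity minus one, which agrees with the reported values to within rounding (90.27% + 97.87% - 100% ≈ 88.15%).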
Pages: 10