Dual-stage explainable ensemble learning model for diabetes diagnosis

Cited by: 0
Authors
Elgendy, Ibrahim A. [1]
Hosny, Mohamed [1]
Albashrawi, Mousa Ahmad [1]
Alsenan, Shrooq [2]
Affiliations
[1] King Fahd Univ Petr & Minerals, KFUPM Business Sch, IRC Finance & Digital Econ, Dhahran 31261, Saudi Arabia
[2] Princess Nourah bint Abdulrahman Univ, Coll Comp & Informat Sci, Informat Syst Dept, Riyadh 11671, Saudi Arabia
Keywords
Diabetes diagnosis; Ensemble learning; Explainable artificial intelligence; Autoencoder; Healthcare
DOI
10.1016/j.eswa.2025.126899
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Early diagnosis of diabetes is crucial for effective management and prevention of complications. However, traditional diagnostic methods are often constrained by the complexity of clinical datasets. To this end, this study proposes a novel explainable machine learning (ML) framework to enhance diabetes prediction. Specifically, the developed methodology involves the detection of outliers using the local outlier factor and data reconstruction through a sparse autoencoder. Subsequently, multiple imputation strategies are employed to address missing or erroneous data, while the synthetic minority oversampling technique is applied to mitigate class imbalance. Afterward, a stacking ensemble model, consisting of seven base ML models, is developed for classification, and the outputs of these base models are aggregated using four meta models. To enhance interpretability, two layers of model explainability are implemented: feature importance analysis is conducted to identify the significance of input variables, and Shapley additive explanations (SHAP) are employed to assess the contribution of each base model to the meta model predictions. The results demonstrated that replacing missing data with zeros or mean values led to a noticeable decrease in accuracy compared to K-nearest neighbor imputation or removing samples. Notably, hypertension and kidney failure are pivotal features in the diabetes diagnosis process. Among the base models, the Extra Trees model had the most significant impact on the meta model decisions. The stacking multi-layer perceptron model achieved the highest accuracy of 92.54% for diabetes detection, surpassing the performance of standalone ML techniques. This approach enhances diagnostic precision and provides transparency in model predictions, which is essential for clinical applications.
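The difference between mean imputation and K-nearest neighbor imputation noted in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the choice of k = 2, and the distance definition (root-mean-square difference over features observed in both rows) are all assumptions made for illustration.

```python
import math

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column.
    Assumes every column has at least one observed value."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def _distance(a, b):
    """Root-mean-square difference over the features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no common features, so treat the rows as maximally far apart
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k=2):
    """Replace each None with the mean of that feature over the k nearest
    rows that observe it. Assumes at least one such donor row exists."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if v is not None:
                continue
            donors = [(_distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [val for _, val in donors[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical toy data: columns are [glucose, BMI]; None marks a missing value.
data = [
    [90.0, 22.0],
    [95.0, 23.0],
    [180.0, 35.0],
    [185.0, None],
    [175.0, 34.0],
]
mean_filled = mean_impute(data)
knn_filled = knn_impute(data, k=2)
print(mean_filled[3][1], knn_filled[3][1])
```

In this toy case the mean-imputed BMI is pulled toward the low-glucose rows, while the neighbor-based value follows the two rows with similar glucose levels. That is one plausible reason a neighbor-based strategy could preserve more signal, although the abstract itself does not state the mechanism.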
Pages: 16