A Risk Prediction Model for Type 2 Diabetes Based on Weighted Feature Selection of Random Forest and XGBoost Ensemble Classifier

Cited by: 7
Authors
Xu, Zhongxian [1]
Wang, Zhiliang [1]
Affiliation
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing, Peoples R China
Keywords
diagnosis of diabetes; data mining; weighted feature selection; random forest; extreme gradient boosting
DOI
10.1109/icaci.2019.8778622
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Type 2 diabetes mellitus is a severe chronic disease that threatens human health and has a high incidence worldwide. Effective prediction models are needed to diagnose and prevent diabetes in time. Data mining, with its classification capability, has become increasingly important in medical diagnosis. This paper proposes a risk prediction model for type 2 diabetes based on an ensemble learning method. In the proposed model, a weighted feature selection algorithm based on random forest (RF-WFS) is used for optimal feature selection, and an extreme gradient boosting (XGBoost) classifier is used for classification. The effectiveness of the method was validated by comparing various performance metrics and the results of different contrast experiments. The method also achieves better prediction accuracy than other classification algorithms (C4.5, Naive Bayes, AdaBoost, Random Forest). Validation results on the UCI Pima Indian diabetes dataset show that the model has better accuracy and classification performance than other results reported in the literature. The results indicate that the model would be effective for diagnosing diabetes at an early stage.
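The abstract does not say how RF-WFS turns random forest output into feature weights, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes each feature already has a non-negative importance score, normalizes the scores into weights, and keeps the top-ranked features until their weights reach a chosen share of the total. The function name `select_by_weight`, the parameter `keep_mass`, and the example scores are all hypothetical.

```python
from typing import Dict, List


def select_by_weight(importances: Dict[str, float], keep_mass: float = 0.9) -> List[str]:
    """Keep the highest-weighted features until their normalized weights sum to keep_mass.

    Illustrative assumption only: this is not the RF-WFS procedure from the paper.
    """
    if not importances:
        return []
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    # Normalize raw scores into weights that sum to 1.
    weights = {name: score / total for name, score in importances.items()}
    # Rank features from most to least important (ties keep input order).
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    selected: List[str] = []
    cumulative = 0.0
    for name, weight in ranked:
        selected.append(name)
        cumulative += weight
        if cumulative >= keep_mass:
            break
    return selected


# Hypothetical scores with placeholder feature names; not values from the paper.
example_scores = {"f0": 4.0, "f1": 3.0, "f2": 2.0, "f3": 1.0}
selected = select_by_weight(example_scores, keep_mass=0.8)
print(selected)
```

In practice the scores could come from a fitted scikit-learn `RandomForestClassifier` via its `feature_importances_` attribute, and the retained columns could then be passed to a boosted classifier such as `xgboost.XGBClassifier`. Both steps are left out here to keep the sketch free of dependencies.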
Pages: 278 - 283
Number of pages: 6
Related papers
50 in total
  • [1] Efficient prediction of early-stage diabetes using XGBoost classifier with random forest feature selection technique
    Gundogdu, Serdar
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (22) : 34163 - 34181
  • [2] Efficient prediction of early-stage diabetes using XGBoost classifier with random forest feature selection technique
    Serdar Gündoğdu
    Multimedia Tools and Applications, 2023, 82 : 34163 - 34181
  • [3] Hidden AS link prediction based on random forest feature selection and GWO-XGBoost model
    Wang, Zekang
    Yuan, Fuxiang
    Li, Ruixiang
    Zhang, Meng
    Luo, Xiangyang
    COMPUTER NETWORKS, 2025, 262
  • [4] Construction of a diagnostic classifier for cervical intraepithelial neoplasia and cervical cancer based on XGBoost feature selection and random forest model
    Zhang, Jing
    Yang, Xiuqing
    Chen, Jia
    Han, Jing
    Chen, Xiaofeng
    Fan, Yueping
    Zheng, Hui
    JOURNAL OF OBSTETRICS AND GYNAECOLOGY RESEARCH, 2023, 49 (01) : 296 - 303
  • [5] Prediction Method of Type 2 Diabetes Mellitus Based on a Combination of Hybrid Feature Selection and Random Forest
    Wang, Yunming
    Hu, Jiangang
    Fan, Xinru
    Gao, Xiue
    Liu, Changzheng
    WEB INFORMATION SYSTEMS AND APPLICATIONS, WISA 2024, 2024, 14883 : 439 - 450
  • [6] The Risk Prediction of Type 2 Diabetes based on XGBoost
    Ji, Wei
    Lin, Shaofu
    2019 2ND INTERNATIONAL CONFERENCE ON MECHANICAL, ELECTRONIC AND ENGINEERING TECHNOLOGY (MEET 2019), 2019 : 145 - 150
  • [7] A diabetes prediction model based on Boruta feature selection and ensemble learning
    Zhou, Hongfang
    Xin, Yinbo
    Li, Suli
    BMC BIOINFORMATICS, 2023, 24 (01)
  • [8] A diabetes prediction model based on Boruta feature selection and ensemble learning
    Hongfang Zhou
    Yinbo Xin
    Suli Li
    BMC Bioinformatics, 24
  • [9] Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model
    Wang, Liyang
    Wang, Xiaoya
    Chen, Angxuan
    Jin, Xian
    Che, Huilian
    HEALTHCARE, 2020, 8 (03)
  • [10] Classifying Model of Ancient Glass Products Based on Ensemble Feature Selection and Random Forest
    Lu J.
    Kuei Suan Jen Hsueh Pao/Journal of the Chinese Ceramic Society, 2023, 51 (04) : 1060 - 1065