Machine Learning for Predicting the 3-Year Risk of Incident Diabetes in Chinese Adults

Cited by: 19
Authors
Wu, Yang [1 ,2 ,3 ]
Hu, Haofei [3 ,4 ,5 ]
Cai, Jinlin [1 ,2 ,6 ]
Chen, Runtian [1 ,2 ,3 ]
Zuo, Xin [7 ]
Cheng, Heng [7 ]
Yan, Dewen [1 ,2 ,3 ]
Affiliations
[1] Shenzhen Univ, Affiliated Hosp 1, Dept Endocrinol, Shenzhen, Peoples R China
[2] Shenzhen Second Peoples Hosp, Dept Endocrinol, Shenzhen, Peoples R China
[3] Shenzhen Univ, Hlth Sci Ctr, Shenzhen, Peoples R China
[4] Shenzhen Univ, Affiliated Hosp 1, Dept Nephrol, Shenzhen, Peoples R China
[5] Shenzhen Second Peoples Hosp, Dept Nephrol, Shenzhen, Peoples R China
[6] Shantou Univ, Med Coll, Shantou, Peoples R China
[7] Third Peoples Hosp Shenzhen, Dept Endocrinol, Shenzhen, Peoples R China
Keywords
machine learning; extreme gradient boosting; simple stepwise model; Incident diabetes; risk; TYPE-2; MELLITUS; MODELS; COMPLICATIONS; NOMOGRAM; TRENDS; IMPACT; BMI;
DOI
10.3389/fpubh.2021.626331
Chinese Library Classification (CLC) number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Purpose: We aimed to establish and validate a risk assessment system that combines demographic and clinical variables to predict the 3-year risk of incident diabetes in Chinese adults.
Methods: A 3-year cohort study was performed on 15,928 Chinese adults without diabetes at baseline. All participants were randomly divided into a training set (n = 7,940) and a validation set (n = 7,988). Extreme gradient boosting (XGBoost), a machine learning technique, was used to select the most important variables from the candidate variables. A stepwise model was then established based on the predictors chosen by the XGBoost model. The area under the receiver operating characteristic curve (AUC), decision curve analysis, and calibration analysis were used to assess the discrimination, clinical usefulness, and calibration of the model, respectively. External validation was performed on a cohort of 11,113 Japanese participants.
Results: In the training and validation sets, 148 and 145 incident diabetes cases occurred, respectively. XGBoost selected the 10 most important variables from 15 candidate variables; fasting plasma glucose (FPG), body mass index (BMI), and age were the top 3. A stepwise model and a prediction nomogram were then established. The AUCs of the stepwise model were 0.933 and 0.910 in the training and validation sets, respectively. The Hosmer-Lemeshow test showed no statistically significant difference between the predicted and observed diabetes risk (P = 0.068 for the training set, P = 0.165 for the validation set). Decision curve analysis demonstrated the clinical usefulness of the stepwise model across a wide range of threshold probabilities. There were almost no interactions among the predictors (most P-values for interaction > 0.05). Furthermore, the AUC for the external validation set was 0.830, and the Hosmer-Lemeshow test for the external validation set showed no statistically significant difference between the predicted and observed diabetes risk (P = 0.824).
Conclusion: We established and validated a risk assessment system for characterizing the 3-year risk of incident diabetes.
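The three evaluation measures named in the abstract (AUC for discrimination, the Hosmer-Lemeshow statistic for calibration, and net benefit for decision curve analysis) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `auc`, `net_benefit`, and `hosmer_lemeshow_statistic` are hypothetical, and the four-row toy data set is invented for illustration and is not taken from the study.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold probability (0 < threshold < 1):
    TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def hosmer_lemeshow_statistic(y_true: Sequence[int], y_score: Sequence[float],
                              groups: int = 10) -> float:
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(s for s, _ in chunk)   # sum of predicted risks
        observed = sum(y for _, y in chunk)   # number of observed cases
        denom = expected * (1.0 - expected / size)
        if denom > 0:                         # skip degenerate groups
            stat += (observed - expected) ** 2 / denom
    return stat


# Toy data (invented, not from the study): outcome and predicted risk.
toy_outcome = [0, 0, 1, 1]
toy_risk = [0.1, 0.4, 0.35, 0.8]

print(auc(toy_outcome, toy_risk))               # 3 of 4 pairs ranked correctly
print(net_benefit(toy_outcome, toy_risk, 0.5))  # one true positive, no false positives
```

A decision curve is obtained by evaluating `net_benefit` over a grid of threshold probabilities. The Hosmer-Lemeshow statistic is usually compared against a chi-square distribution (commonly with the number of groups minus 2 degrees of freedom) to obtain P-values such as those reported above; that step is omitted here to keep the sketch free of third-party dependencies.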
Pages: 12