Machine Learning for Predicting the 3-Year Risk of Incident Diabetes in Chinese Adults

Cited by: 19

Authors:

Wu, Yang [1,2,3]

Hu, Haofei [3,4,5]

Cai, Jinlin [1,2,6]

Chen, Runtian [1,2,3]

Zuo, Xin [7]

Cheng, Heng [7]

Yan, Dewen [1,2,3]

Affiliations:

[1] Shenzhen Univ, Affiliated Hosp 1, Dept Endocrinol, Shenzhen, Peoples R China

[2] Shenzhen Second Peoples Hosp, Dept Endocrinol, Shenzhen, Peoples R China

[3] Shenzhen Univ, Hlth Sci Ctr, Shenzhen, Peoples R China

[4] Shenzhen Univ, Affiliated Hosp 1, Dept Nephrol, Shenzhen, Peoples R China

[5] Shenzhen Second Peoples Hosp, Dept Nephrol, Shenzhen, Peoples R China

[6] Shantou Univ, Med Coll, Shantou, Peoples R China

[7] Third Peoples Hosp Shenzhen, Dept Endocrinol, Shenzhen, Peoples R China

Source:

FRONTIERS IN PUBLIC HEALTH | 2021 / Volume 9

Keywords:

machine learning; extreme gradient boosting; simple stepwise model; Incident diabetes; risk; TYPE-2; MELLITUS; MODELS; COMPLICATIONS; NOMOGRAM; TRENDS; IMPACT; BMI;

DOI:

10.3389/fpubh.2021.626331

Chinese Library Classification (CLC) number:

R1 [Preventive Medicine, Hygiene];

Discipline classification code:

1004 ; 120402 ;

Abstract:

Purpose: We aimed to establish and validate a risk assessment system that combines demographic and clinical variables to predict the 3-year risk of incident diabetes in Chinese adults. Methods: A 3-year cohort study was performed on 15,928 Chinese adults without diabetes at baseline. Participants were randomly divided into a training set (n = 7,940) and a validation set (n = 7,988). Extreme gradient boosting (XGBoost), a machine learning technique, was used to select the most important variables from the candidate variables, and a stepwise model was then built from the predictors chosen by XGBoost. The area under the receiver operating characteristic curve (AUC), decision curve analysis, and calibration analysis were used to assess the discrimination, clinical usefulness, and calibration of the model, respectively. External validation was performed on a cohort of 11,113 Japanese participants. Results: There were 148 and 145 incident diabetes cases in the training and validation sets, respectively. XGBoost selected the 10 most important of 15 candidate variables; fasting plasma glucose (FPG), body mass index (BMI), and age were the top three. A stepwise model and a prediction nomogram were then established. The AUCs of the stepwise model were 0.933 and 0.910 in the training and validation sets, respectively. The Hosmer-Lemeshow test showed no statistically significant difference between the predicted and observed diabetes risk (p = 0.068 for the training set, p = 0.165 for the validation set), indicating adequate calibration. Decision curve analysis supported the clinical usefulness of the stepwise model across a wide range of threshold probabilities. There were almost no interactions between the predictors (most P-values for interaction >0.05). Furthermore, the AUC for the external validation set was 0.830, and the Hosmer-Lemeshow test for the external validation set showed no statistically significant difference between the predicted and observed diabetes risk (P = 0.824). Conclusion: We established and validated a risk assessment system for characterizing the 3-year risk of incident diabetes.
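The abstract names two evaluation measures, AUC for discrimination and decision curve analysis for clinical usefulness. The sketch below shows how each is commonly defined. It is not the authors' code: the function names `auc` and `net_benefit` and the toy labels and scores are hypothetical, chosen only for illustration. AUC is computed as the fraction of (case, non-case) pairs in which the case receives the higher predicted risk, and net benefit follows the usual decision-curve definition, TP/n - (FP/n) * pt / (1 - pt), at a threshold probability pt.

```python
def auc(labels, scores):
    """Rank-based AUC: share of (positive, negative) pairs where the
    positive has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Net benefit of treating everyone whose predicted risk is at or
    above `threshold` (one point on a decision curve)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = incident diabetes, 0 = no diabetes.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.02, 0.05, 0.10, 0.30, 0.35, 0.40, 0.70, 0.90]

print("AUC:", auc(labels, scores))
for pt in (0.2, 0.5):
    print("net benefit at threshold", pt, ":", net_benefit(labels, scores, pt))
```

Evaluating `net_benefit` over a grid of thresholds and plotting the result against the "treat all" and "treat none" strategies gives a decision curve. The pairwise AUC loop is quadratic in sample size, so a sort-based implementation would be preferable for a cohort of the size reported here.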

Pages: 12
