Predictive models for diabetes mellitus using machine learning techniques

Cited by: 108
Authors
Lai, Hang [1, 2]
Huang, Huaxiong [1, 2]
Keshavjee, Karim [2, 3]
Guergachi, Aziz [1, 2, 4]
Gao, Xin [1, 2]
Affiliations
[1] York Univ, Dept Math & Stat, 4700 Keele St, Toronto, ON M3J 1P3, Canada
[2] Ctr Quantitat Anal & Modelling CQAM Lab, Fields Inst Res Math Sci, 222 Coll St, Toronto, ON M5T 3J1, Canada
[3] Univ Toronto, Inst Hlth Policy Management & Evaluat, 155 Coll St,Suite 425, Toronto, ON M5T 3M6, Canada
[4] Ryerson Univ, Ted Rogers Sch Management Informat Technol Manage, 350 Victoria St, Toronto, ON M5B 2K3, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Diabetes mellitus; Machine learning; Gradient boosting machine; Predictive models; Misclassification cost; RISK; PERFORMANCE; ADULTS; SCORE
DOI
10.1186/s12902-019-0436-6
Chinese Library Classification number
R5 [Internal Medicine]
Discipline classification codes
1002; 100201
Abstract
Background: Diabetes Mellitus is an increasingly prevalent chronic disease characterized by the body's inability to metabolize glucose. The objective of this study was to build an effective predictive model with high sensitivity and selectivity to better identify Canadian patients at risk of having Diabetes Mellitus, based on patient demographic data and the laboratory results collected during their visits to medical facilities.
Methods: Using the most recent records of 13,309 Canadian patients aged between 18 and 90 years, along with their laboratory information (age, sex, fasting blood glucose, body mass index, high-density lipoprotein, triglycerides, blood pressure, and low-density lipoprotein), we built predictive models using Logistic Regression and Gradient Boosting Machine (GBM) techniques. The area under the receiver operating characteristic curve (AROC) was used to evaluate the discriminatory capability of these models. We used the adjusted threshold method and the class weight method to improve sensitivity, defined as the proportion of Diabetes Mellitus patients correctly predicted by the model. We also compared these models to other machine learning techniques such as Decision Tree and Random Forest.
Results: The AROC for the proposed GBM model is 84.7% with a sensitivity of 71.6%, and the AROC for the proposed Logistic Regression model is 84.0% with a sensitivity of 73.4%. The GBM and Logistic Regression models perform better than the Random Forest and Decision Tree models.
Conclusions: Using commonly used lab results, our models show a high ability to identify patients with diabetes, with satisfactory sensitivity. These models can be built into an online computer program to help physicians predict which patients are likely to develop diabetes and provide the necessary preventive interventions. The models were developed and validated on the Canadian population, which makes them more specific and better suited to Canadian patients than existing models developed from US or other populations. Fasting blood glucose, body mass index, high-density lipoprotein, and triglycerides were the most important predictors in these models.
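The adjusted threshold idea and the AROC metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the labels and scores below are made-up toy values, the function names are hypothetical, and the class weight method is not shown. It only demonstrates that lowering the decision threshold raises sensitivity while the AROC, which is computed over all thresholds, stays the same.

```python
from typing import Sequence


def sensitivity(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Proportion of true positives (label 1) whose score reaches the threshold."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    if not positives:
        raise ValueError("no positive cases")
    return sum(s >= threshold for s in positives) / len(positives)


def specificity(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Proportion of true negatives (label 0) whose score falls below the threshold."""
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not negatives:
        raise ValueError("no negative cases")
    return sum(s < threshold for s in negatives) / len(negatives)


def aroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive case scores higher than a random negative case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumption, for illustration only): 1 = diabetes, 0 = no diabetes.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.2, 0.7, 0.3, 0.3, 0.1, 0.1, 0.05]

for threshold in (0.5, 0.35):
    print(
        f"threshold={threshold}: "
        f"sensitivity={sensitivity(y_true, scores, threshold):.2f}, "
        f"specificity={specificity(y_true, scores, threshold):.2f}"
    )
print(f"AROC={aroc(y_true, scores):.3f}")
```

In practice the threshold would be chosen on a validation set, trading some specificity for the sensitivity that a screening application needs.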
Pages: 9