Predictive models for diabetes mellitus using machine learning techniques

Cited by: 108
Authors
Lai, Hang [1, 2]
Huang, Huaxiong [1, 2]
Keshavjee, Karim [2, 3]
Guergachi, Aziz [1, 2, 4]
Gao, Xin [1, 2]
Affiliations
[1] York Univ, Dept Math & Stat, 4700 Keele St, Toronto, ON M3J 1P3, Canada
[2] Ctr Quantitat Anal & Modelling CQAM Lab, Fields Inst Res Math Sci, 222 Coll St, Toronto, ON M5T 3J1, Canada
[3] Univ Toronto, Inst Hlth Policy Management & Evaluat, 155 Coll St,Suite 425, Toronto, ON M5T 3M6, Canada
[4] Ryerson Univ, Ted Rogers Sch Management Informat Technol Manage, 350 Victoria St, Toronto, ON M5B 2K3, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Diabetes mellitus; Machine learning; Gradient boosting machine; Predictive models; Misclassification cost; RISK; PERFORMANCE; ADULTS; SCORE;
DOI
10.1186/s12902-019-0436-6
Chinese Library Classification number
R5 [Internal medicine]
Discipline classification codes
1002; 100201
Abstract
Background: Diabetes Mellitus is an increasingly prevalent chronic disease characterized by the body's inability to metabolize glucose. The objective of this study was to build an effective predictive model with high sensitivity and selectivity to better identify Canadian patients at risk of having Diabetes Mellitus, based on patient demographic data and laboratory results from their visits to medical facilities.
Methods: Using the most recent records of 13,309 Canadian patients aged between 18 and 90 years, along with their demographic and laboratory information (age, sex, fasting blood glucose, body mass index, high-density lipoprotein, triglycerides, blood pressure, and low-density lipoprotein), we built predictive models using Logistic Regression and Gradient Boosting Machine (GBM) techniques. The area under the receiver operating characteristic curve (AROC) was used to evaluate the discriminatory capability of these models. We used the adjusted threshold method and the class weight method to improve sensitivity, defined as the proportion of Diabetes Mellitus patients correctly predicted by the model. We also compared these models to other machine learning techniques such as Decision Tree and Random Forest.
Results: The AROC for the proposed GBM model is 84.7% with a sensitivity of 71.6%, and the AROC for the proposed Logistic Regression model is 84.0% with a sensitivity of 73.4%. The GBM and Logistic Regression models perform better than the Random Forest and Decision Tree models.
Conclusions: Our models predict Diabetes Mellitus from commonly used lab results with high discriminatory ability and satisfactory sensitivity. These models can be built into an online computer program to help physicians identify patients likely to develop diabetes and provide the necessary preventive interventions. Because the models were developed and validated on a Canadian population, they are more specific and powerful when applied to Canadian patients than existing models developed from US or other populations. Fasting blood glucose, body mass index, high-density lipoprotein, and triglycerides were the most important predictors in these models.
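The evaluation terms used in the abstract (AROC, sensitivity, and the adjusted threshold method) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels, and the toy scores are all hypothetical, and the scores stand in for the predicted probabilities that any classifier (for example Logistic Regression or GBM) would output. It only shows how lowering the decision threshold raises sensitivity at the cost of specificity, while AROC does not depend on the threshold.

```python
def sensitivity(y_true, scores, threshold):
    """Fraction of actual positives whose score is at or above the threshold."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    return sum(s >= threshold for s in positives) / len(positives)


def specificity(y_true, scores, threshold):
    """Fraction of actual negatives whose score is below the threshold."""
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    return sum(s < threshold for s in negatives) / len(negatives)


def aroc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive is scored higher than a random negative (ties count as half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = diabetes, 0 = no diabetes.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.3, 0.7, 0.35, 0.2, 0.2, 0.1, 0.05]

# Default threshold of 0.5 versus an adjusted (lowered) threshold of 0.3.
for threshold in (0.5, 0.3):
    print(
        f"threshold={threshold}: "
        f"sensitivity={sensitivity(y_true, scores, threshold):.2f}, "
        f"specificity={specificity(y_true, scores, threshold):.2f}"
    )
print(f"AROC={aroc(y_true, scores):.3f}")
```

On this toy data, moving the threshold from 0.5 to 0.3 catches every positive case but misclassifies more negatives. The class weight method mentioned in the abstract targets the same trade-off during training rather than after it, and is not shown here.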
Pages: 9