Logistic regression was as good as machine learning for predicting major chronic diseases

Cited by: 257
Authors
Nusinovici, Simon [1]
Tham, Yih Chung [1, 3]
Yan, Marco Yu Chak [1]
Ting, Daniel Shu Wei [1, 3]
Li, Jialiang [1, 4]
Sabanayagam, Charumathi [1, 3]
Wong, Tien Yin [1, 2, 3]
Cheng, Ching-Yu [1, 2, 3]
Affiliations
[1] Singapore Natl Eye Ctr, Singapore Eye Res Inst, Singapore, Singapore
[2] Natl Univ Singapore, Yong Loo Lin Sch Med, Dept Ophthalmol, Singapore, Singapore
[3] Duke NUS Med Sch, Ophthalmol & Visual Sci Acad Clin Programme, Singapore, Singapore
[4] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore, Singapore
Funding
UK Medical Research Council;
Keywords
Machine learning; Logistic regression; Prognostic modeling; Chronic diseases; Interaction; Nonlinearity; SINGAPORE MALAY EYE; CONVENTIONAL REGRESSION; CARDIOVASCULAR-DISEASE; RISK PREDICTION; METHODOLOGY; CLASSIFICATION; RATIONALE; PROGNOSIS; MORTALITY; DIAGNOSIS;
DOI
10.1016/j.jclinepi.2020.03.002
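Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Discipline classification code
Abstract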
Objective: To evaluate the performance of machine learning (ML) algorithms and to compare them with logistic regression for the prediction of risk of cardiovascular diseases (CVDs), chronic kidney disease (CKD), diabetes (DM), and hypertension (HTN) in a prospective cohort study using simple clinical predictors. Study Design and Setting: We conducted analyses in a population-based cohort study in Asian adults (n = 6,762). Five different ML models were considered (single-hidden-layer neural network, support vector machine, random forest, gradient boosting machine, and k-nearest neighbor) and were compared with standard logistic regression. Results: The incidences at 6 years of CVD, CKD, DM, and HTN cases were 4.0%, 7.0%, 9.2%, and 34.6%, respectively. Logistic regression reached the highest area under the receiver operating characteristic curve for CKD (0.905 [0.88, 0.93]) and DM (0.768 [0.73, 0.81]) predictions. For CVD and HTN, the best models were neural network (0.753 [0.70, 0.81]) and support vector machine (0.780 [0.747, 0.812]), respectively. However, the differences with logistic regression were small (less than 1%) and nonsignificant. Logistic regression, gradient boosting machine, and neural network were systematically ranked among the best models. Conclusion: Logistic regression yields as good performance as ML models to predict the risk of major chronic diseases with low incidence and simple clinical predictors. © 2020 Elsevier Inc. All rights reserved.
Pages: 56-69
Number of pages: 14
Related papers
50 in total
  • [21] Comparison of Multivariable Logistic Regression and Machine Learning Models for Predicting Bronchopulmonary Dysplasia or Death in Very Preterm Infants
    Khurshid, Faiza
    Coo, Helen
    Khalil, Amal
    Messiha, Jonathan
    Ting, Joseph Y.
    Wong, Jonathan
    Shah, Prakesh S.
    FRONTIERS IN PEDIATRICS, 2021, 9
  • [22] Machine Learning Techniques for Predicting Heart Diseases
    Taha, Mohammed A.
    Alsaidi, Saif Ali Abd Alradha
    Hussein, Reem Ali
    2022 INTERNATIONAL SYMPOSIUM ON INNOVATIVE INFORMATICS OF BISKRA, ISNIB, 2022, : 123 - 128
  • [23] Predicting Meditation Practices Among Individuals With Cardiovascular Diseases: A Logistic Regression Analysis
    Lu, Junfei
    Ford, Cassandra D.
    Vaughans, Doris
    REHABILITATION PSYCHOLOGY, 2025, 70 (01) : 104 - 109
  • [25] Comorbidity and multimorbidity prediction of major chronic diseases using machine learning and network analytics
    Uddin, Shahadat
    Wang, Shangzhou
    Lu, Haohui
    Khan, Arif
    Hajati, Farshid
    Khushi, Matloob
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 205
  • [26] Machine Learning Regression Model for Predicting Honey Harvests
    Campbell, Tristan
    Dixon, Kingsley W.
    Dods, Kenneth
    Fearns, Peter
    Handcock, Rebecca
    AGRICULTURE-BASEL, 2020, 10 (04):
  • [28] Predicting Passivhaus certification of dwellings using machine learning: A comparative analysis of logistic regression and gradient boosting decision trees
    Du, Yusheng
    Gou, Zhonghua
    JOURNAL OF BUILDING ENGINEERING, 2023, 79
  • [29] Predicting Overweight and Obesity Status Among Malaysian Working Adults With Machine Learning or Logistic Regression: Retrospective Comparison Study
    Wong, Jyh Eiin
    Yamaguchi, Miwa
    Nishi, Nobuo
    Araki, Michihiro
    Wee, Lei Hum
    JMIR FORMATIVE RESEARCH, 2022, 6 (12)
  • [30] Machine learning methods are comparable to logistic regression techniques in predicting severe walking limitation following total knee arthroplasty
    Pua, Yong-Hao
    Kang, Hakmook
    Thumboo, Julian
    Clark, Ross Allan
    Chew, Eleanor Shu-Xian
    Poon, Cheryl Lian-Li
    Chong, Hwei-Chi
    Yeo, Seng-Jin
    KNEE SURGERY SPORTS TRAUMATOLOGY ARTHROSCOPY, 2020, 28 (10) : 3207 - 3216