Logistic regression was as good as machine learning for predicting major chronic diseases

Cited by: 257
Authors
Nusinovici, Simon [1]
Tham, Yih Chung [1, 3]
Yan, Marco Yu Chak [1]
Ting, Daniel Shu Wei [1, 3]
Li, Jialiang [1, 4]
Sabanayagam, Charumathi [1, 3]
Wong, Tien Yin [1, 2, 3]
Cheng, Ching-Yu [1, 2, 3]
Affiliations
[1] Singapore Natl Eye Ctr, Singapore Eye Res Inst, Singapore, Singapore
[2] Natl Univ Singapore, Yong Loo Lin Sch Med, Dept Ophthalmol, Singapore, Singapore
[3] Duke NUS Med Sch, Ophthalmol & Visual Sci Acad Clin Programme, Singapore, Singapore
[4] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore, Singapore
Funding
UK Medical Research Council;
Keywords
Machine learning; Logistic regression; Prognostic modeling; Chronic diseases; Interaction; Nonlinearity; SINGAPORE MALAY EYE; CONVENTIONAL REGRESSION; CARDIOVASCULAR-DISEASE; RISK PREDICTION; METHODOLOGY; CLASSIFICATION; RATIONALE; PROGNOSIS; MORTALITY; DIAGNOSIS;
DOI
10.1016/j.jclinepi.2020.03.002
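Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Subject classification code
Abstract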
Objective: To evaluate the performance of machine learning (ML) algorithms and to compare them with logistic regression for predicting the risk of cardiovascular diseases (CVDs), chronic kidney disease (CKD), diabetes (DM), and hypertension (HTN) in a prospective cohort study using simple clinical predictors.
Study Design and Setting: We conducted analyses in a population-based cohort study in Asian adults (n = 6,762). Five different ML models were considered (single-hidden-layer neural network, support vector machine, random forest, gradient boosting machine, and k-nearest neighbor) and were compared with standard logistic regression.
Results: The incidences at 6 years of CVD, CKD, DM, and HTN cases were 4.0%, 7.0%, 9.2%, and 34.6%, respectively. Logistic regression reached the highest area under the receiver operating characteristic curve for CKD (0.905 [0.88, 0.93]) and DM (0.768 [0.73, 0.81]) predictions. For CVD and HTN, the best models were neural network (0.753 [0.70, 0.81]) and support vector machine (0.780 [0.747, 0.812]), respectively. However, the differences with logistic regression were small (less than 1%) and nonsignificant. Logistic regression, gradient boosting machine, and neural network were systematically ranked among the best models.
Conclusion: Logistic regression performs as well as ML models in predicting the risk of major chronic diseases with low incidence and simple clinical predictors. © 2020 Elsevier Inc. All rights reserved.
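The results above are reported as an area under the receiver operating characteristic curve (AUC) with an interval in brackets. As a rough illustration of how such a figure can be computed, the sketch below calculates the AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, and attaches a percentile bootstrap interval. The abstract does not say how the authors obtained their intervals, so the bootstrap, the function names `auc` and `bootstrap_ci`, and the toy data are all assumptions made for illustration only.

```python
import random


def auc(labels, scores):
    """AUC as the share of (case, non-case) pairs in which the case
    has the higher score; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only cases or only non-cases.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = developed the disease, 0 = did not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 0.75
print(bootstrap_ci(toy_labels, toy_scores))
```

The pairwise loop is quadratic in sample size, which is fine for a small illustration; for a cohort of several thousand participants a rank-based formula or a library routine would be the more practical choice.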
Pages: 56-69
Number of pages: 14
Related papers
50 in total
  • [31] Predicting ipsilateral recurrence in women treated for ductal carcinoma in situ using machine learning and multivariable logistic regression models
    Lamb, Leslie R.
    Mercaldo, Sarah
    Kim, Geunwon
    Hovis, Keegan
    Oseni, Tawakalitu O.
    Bahl, Manisha
    CLINICAL IMAGING, 2022, 92 : 94 - 100
  • [32] Comparison of machine learning and logistic regression models in predicting acute kidney injury: A systematic review and meta-analysis
    Song, Xuan
    Liu, Xinyan
    Liu, Fei
    Wang, Chunting
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2021, 151
  • [33] HRLR-LOGISTIC: A Factor Selection Machine Learning Method Coupled with Binary Logistic Regression
    Xie, Haoyan
    Sadiq, Maryam
    Huang, Hai
    Sarwar, Sughra
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2022, 2022
  • [34] An In-memory Architecture for Machine Learning Classifier using Logistic Regression
    Saragada, Prasanna Kumar
    Rathod, Meghnath
    Das, Bishnu Prasad
    2019 IEEE INTERNATIONAL SYMPOSIUM ON SMART ELECTRONIC SYSTEMS (ISES 2019), 2019, : 209 - 214
  • [35] Machine learning versus logistic regression for the prediction of complications after pancreatoduodenectomy
    Ingwersen, Erik W.
    Stam, Wessel T.
    Meijs, Bono J. V.
    Roor, Joran
    Besselink, Marc G.
    Koerkamp, Bas Groot
    de Hingh, Ignace H. J. T.
    van Santvoort, Hjalmar C.
    Stommel, Martijn W. J.
    Daams, Freek
    SURGERY, 2023, 174 (03) : 435 - 440
  • [36] Heart Disease Prediction Using Logistic Regression Machine Learning Model
    Hrvat, Faris
    Spahic, Lemana
    Aleta, Amina
    MEDICON 2023 AND CMBEBIH 2023, VOL 1, 2024, 93 : 654 - 662
  • [37] Loan Repayment Prediction Using Logistic Regression Ensemble Learning With Machine Learning Algorithms
    Dinh, Thuan Nguyen
    Thanh, Binh Pham
    2022 9TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE, ISCMI, 2022, : 79 - 85
  • [38] Does Good ESG Lead to Better Financial Performances by Firms? Machine Learning and Logistic Regression Models of Public Enterprises in Europe
    De Lucia, Caterina
    Pazienza, Pasquale
    Bartlett, Mark
    SUSTAINABILITY, 2020, 12 (13)
  • [39] Machine learning methods for predicting major types of rheumatic heart diseases in children of Southern Punjab, Pakistan
    Shahid, Sana
    Khurram, Haris
    Billah, Baki
    Akbar, Atif
    Shehzad, Muhammad Ahmed
    Shabbir, Muhammad Farhan
    FRONTIERS IN CARDIOVASCULAR MEDICINE, 2022, 9
  • [40] Predicting Infectious Diseases by Using Machine Learning Classifiers
    Gomez-Pulido, Juan A.
    Romero-Muelas, Jose M.
    Gomez-Pulido, Jose M.
    Castillo Sequera, Jose L.
    Sanz Moreno, Jose
    Polo-Luque, Maria-Luz
    Garces-Jimenez, Alberto
    BIOINFORMATICS AND BIOMEDICAL ENGINEERING (IWBBIO 2020), 2020, 12108 : 590 - 599