Logistic regression was as good as machine learning for predicting major chronic diseases

Cited by: 257
Authors
Nusinovici, Simon [1]
Tham, Yih Chung [1,3]
Yan, Marco Yu Chak [1]
Ting, Daniel Shu Wei [1,3]
Li, Jialiang [1,4]
Sabanayagam, Charumathi [1,3]
Wong, Tien Yin [1,2,3]
Cheng, Ching-Yu [1,2,3]
Affiliations
[1] Singapore Natl Eye Ctr, Singapore Eye Res Inst, Singapore, Singapore
[2] Natl Univ Singapore, Yong Loo Lin Sch Med, Dept Ophthalmol, Singapore, Singapore
[3] Duke NUS Med Sch, Ophthalmol & Visual Sci Acad Clin Programme, Singapore, Singapore
[4] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore, Singapore
Funding
Medical Research Council (UK)
Keywords
Machine learning; Logistic regression; Prognostic modeling; Chronic diseases; Interaction; Nonlinearity; SINGAPORE MALAY EYE; CONVENTIONAL REGRESSION; CARDIOVASCULAR-DISEASE; RISK PREDICTION; METHODOLOGY; CLASSIFICATION; RATIONALE; PROGNOSIS; MORTALITY; DIAGNOSIS;
DOI
10.1016/j.jclinepi.2020.03.002
Chinese Library Classification number
R19 [Health care organization and services (health services administration)]
Abstract
Objective: To evaluate the performance of machine learning (ML) algorithms and compare them with logistic regression for predicting the risk of cardiovascular diseases (CVDs), chronic kidney disease (CKD), diabetes (DM), and hypertension (HTN) in a prospective cohort study using simple clinical predictors.
Study Design and Setting: We conducted analyses in a population-based cohort study of Asian adults (n = 6,762). Five ML models were considered (single-hidden-layer neural network, support vector machine, random forest, gradient boosting machine, and k-nearest neighbor) and compared with standard logistic regression.
Results: The 6-year incidences of CVD, CKD, DM, and HTN were 4.0%, 7.0%, 9.2%, and 34.6%, respectively. Logistic regression reached the highest area under the receiver operating characteristic curve for CKD (0.905 [0.88, 0.93]) and DM (0.768 [0.73, 0.81]) predictions. For CVD and HTN, the best models were neural network (0.753 [0.70, 0.81]) and support vector machine (0.780 [0.747, 0.812]), respectively. However, the differences from logistic regression were small (less than 1%) and nonsignificant. Logistic regression, gradient boosting machine, and neural network were systematically ranked among the best models.
Conclusion: Logistic regression performs as well as ML models in predicting the risk of major chronic diseases with low incidence and simple clinical predictors. © 2020 Elsevier Inc. All rights reserved.
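The abstract compares models by area under the receiver operating characteristic curve (AUC), reported with a bracketed interval. As a rough illustration only, and not the authors' code, the sketch below computes AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, and attaches a percentile bootstrap interval. The function names `auc` and `bootstrap_ci` and the toy data are hypothetical, and the bootstrap is an assumption: the abstract does not say how its intervals were obtained.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a case outscores a non-case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC (an assumed method, for illustration)."""
    auc(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    idx = range(len(labels))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        # Skip resamples that contain only cases or only non-cases.
        if 0 < sum(ys) < len(ys):
            stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up predicted risks for eight people, three of whom develop the disease.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.9]
print(round(auc(toy_labels, toy_scores), 3))  # 14 of 15 case/non-case pairs are ordered correctly: 0.933
print(bootstrap_ci(toy_labels, toy_scores))
```

With a low-incidence outcome, many bootstrap resamples contain few cases, which is one reason intervals around AUC can be wide even when point estimates look similar across models.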
Pages: 56-69
Number of pages: 14
Related papers
50 in total
  • [41] Predicting corporate financial distress based on integration of support vector machine and logistic regression
    Hua, Zhongsheng
    Wang, Yu
    Xu, Xiaoyan
    Zhang, Bin
    Liang, Liang
    EXPERT SYSTEMS WITH APPLICATIONS, 2007, 33 (02) : 434 - 440
  • [42] Logistic regression technique is comparable to complex machine learning algorithms in predicting cognitive impairment related to post intensive care syndrome
    TingTing Wu
    YueQing Wei
    JingBing Wu
    BiLan Yi
    Hong Li
    Scientific Reports, 13
  • [43] Logistic regression technique is comparable to complex machine learning algorithms in predicting cognitive impairment related to post intensive care syndrome
    Wu, TingTing
    Wei, YueQing
    Wu, JingBing
    Yi, BiLan
    Li, Hong
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [44] Machine learning versus multivariate logistic regression for predicting severe COVID-19 in hospitalized children with Omicron variant infection
    Liu, Pan
    Xing, Zixuan
    Peng, Xiaokang
    Zhang, Mengyi
    Shu, Chang
    Wang, Ce
    Li, Ruina
    Tang, Li
    Wei, Huijing
    Ran, Xiaoshan
    Qiu, Sikai
    Gao, Ning
    Yeo, Yee Hui
    Liu, Xiaoguai
    Ji, Fanpu
    JOURNAL OF MEDICAL VIROLOGY, 2024, 96 (02)
  • [45] Investigation of expert rule bases, logistic regression, and non-linear machine learning techniques for predicting response to antiretroviral treatment
    Prosperi, Mattia C. F.
    Altmann, Andre
    Rosen-Zvi, Michal
    Aharoni, Ehud
    Borgulya, Gabor
    Bazso, Fulop
    Sonnerborg, Anders
    Schuelter, Eugen
    Struck, Daniel
    Ulivi, Giovanni
    Vandamme, Anne-Mieke
    Vercauteren, Jurgen
    Zazzi, Maurizio
    ANTIVIRAL THERAPY, 2009, 14 (03) : 433 - 442
  • [46] Predicting synkinesis caused by Bell's palsy or Ramsay Hunt syndrome using machine learning-based logistic regression
    Kishimoto-Urata, Megumi
    Urata, Shinji
    Nishijima, Hironobu
    Baba, Shintaro
    Fujimaki, Yoko
    Kondo, Kenji
    Yamasoba, Tatsuya
    LARYNGOSCOPE INVESTIGATIVE OTOLARYNGOLOGY, 2023, 8 (05) : 1189 - 1195
  • [47] MACHINE LEARNING OUTPERFORMS LOGISTIC REGRESSION IN PREDICTING ACCURACY OF CCU ADMISSION FOR HIGH GRADE SEROUS ADVANCED OVARIAN CANCER PATIENTS
    Laios, A.
    De Oliveira, R. V.
    Lucas, D.
    Tan, Y.
    Saalmink, G.
    Zubayraeva, A.
    Thangavelu, A.
    Hutson, R.
    Broadhead, T.
    Nugent, D.
    Theophilou, G.
    Gomes De Lima, K. M.
    Dejong, D.
    INTERNATIONAL JOURNAL OF GYNECOLOGICAL CANCER, 2021, 31 : A179 - A179
  • [48] Predicting the Energetic Proton Flux with a Machine Learning Regression Algorithm
    Stumpo, Mirko
    Laurenza, Monica
    Benella, Simone
    Marcucci, Maria Federica
    ASTROPHYSICAL JOURNAL, 2024, 975 (01)
  • [49] Performance of Machine Learning Compared With Regression Analysis in Predicting Albuminuria
    Arsiwala, Ali Haider
    Mathew, Marcella
    Scheppach, Johannes B.
    JOURNAL OF THE AMERICAN SOCIETY OF NEPHROLOGY, 2022, 33 (11) : 301 - 301
  • [50] Chronic stress in practice assistants: An analytic approach comparing four machine learning classifiers with a standard logistic regression model
    Bozorgmehr, Arezoo
    Thielmann, Anika
    Weltermann, Birgitta
    PLOS ONE, 2021, 16 (05)