Predicting COVID-19 mortality risk in Toronto, Canada: a comparison of tree-based and regression-based machine learning methods

Cited by: 13
Authors
Feng, Cindy [1]
Kephart, George [1]
Juarez-Colunga, Elizabeth [2]
Affiliations
[1] Dalhousie Univ, Dept Community Hlth & Epidemiol, Fac Med, 5790 Univ Ave, Halifax, NS B3H 1V7, Canada
[2] Univ Colorado, Dept Biostat & Informat, Anschutz Med Campus, Aurora, CO 80045 USA
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
COVID-19; Mortality; Predictive model; Generalized additive model; Classification trees; Extreme gradient boosting; Logistic regression; Models
DOI
10.1186/s12874-021-01441-4
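Chinese Library Classification (CLC) number
R19 [Health organizations and services (health services management)];
Discipline classification code
Abstract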
Background: Coronavirus disease (COVID-19) presents an unprecedented threat to global health. Accurately predicting mortality risk among infected individuals is crucial for prioritizing medical care and mitigating the burden on the healthcare system. The present study aimed to assess the accuracy of machine learning methods in predicting COVID-19 mortality risk.
Methods: We compared the performance of classification tree, random forest (RF), extreme gradient boosting (XGBoost), logistic regression, generalized additive model (GAM) and linear discriminant analysis (LDA) in predicting mortality risk among 49,216 COVID-19 positive cases in Toronto, Canada, reported from March 1 to December 10, 2020. We used repeated split-sample validation and k-steps-ahead forecasting validation. Predictive models were estimated on training samples, and their predictive accuracy on the testing samples was assessed using the area under the receiver operating characteristic curve (AUC), Brier's score, calibration intercept and calibration slope.
Results: XGBoost was highly discriminative, with an AUC of 0.9669, and outperformed the conventional tree-based methods (classification tree and RF) in predicting COVID-19 mortality risk. Regression-based methods (logistic, GAM and LASSO) performed comparably to XGBoost, with slightly lower AUCs and higher Brier's scores.
Conclusions: XGBoost offers superior performance over conventional tree-based methods and a minor improvement over regression-based methods for predicting COVID-19 mortality risk in the study population.
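The four accuracy measures named in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names, the toy labels and probabilities, and the choice to estimate calibration intercept and slope jointly from a logistic regression of the outcome on the logit of the predicted probability are all assumptions made for illustration. The paper may define the calibration intercept differently, for example with the slope fixed at 1.

```python
import math


def auc(y_true, y_prob):
    """AUC via the Mann-Whitney rank statistic; tied scores share an average rank."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    ranks = [0.0] * len(y_prob)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y_prob[order[j + 1]] == y_prob[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_intercept_slope(y_true, y_prob, n_iter=50, tol=1e-10):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson (assumed definition).

    Probabilities must lie strictly between 0 and 1.
    """
    x = [math.log(p / (1 - p)) for p in y_prob]
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu            # score for the intercept
            g1 += (yi - mu) * xi     # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Hypothetical toy data: 1 = died, 0 = survived, with predicted risks.
y = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
p = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]

print(round(auc(y, p), 4))          # 0.8333
print(round(brier_score(y, p), 4))  # 0.1625
print(calibration_intercept_slope(y, p))
```

Under this definition, a well-calibrated model has an intercept near 0 and a slope near 1. A slope below 1 suggests predictions that are too extreme, and a slope above 1 suggests predictions that are too conservative.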
Pages: 14
Related papers
50 in total
  • [31] Predicting Mortality Risk in Older Hospitalized Persons With COVID-19: A Comparison of the COVID-19 Mortality Risk Score with Frailty and Disability
    Fumagalli, Carlo
    Ungar, Andrea
    Rozzini, Renzo
    Vannini, Matteo
    Coccia, Flaminia
    Cesaroni, Giulia
    Mazzeo, Francesca
    D'Ettore, Nicoletta
    Zocchi, Chiara
    Tassetti, Luigi
    Bartoloni, Alessandro
    Lavorini, Federico
    Marcucci, Rossella
    Olivotto, Iacopo
    Rasero, Laura
    Fattirolli, Francesco
    Fumagalli, Stefano
    Marchionni, Niccolo
    JOURNAL OF THE AMERICAN MEDICAL DIRECTORS ASSOCIATION, 2021, 22 (08) : 1588 - +
  • [32] Machine Learning Algorithms are Superior to Conventional Regression Models in Predicting Risk Stratification of COVID-19 Patients
    Ye, Jiru
    Hua, Meng
    Zhu, Feng
    RISK MANAGEMENT AND HEALTHCARE POLICY, 2021, 14 : 3159 - 3166
  • [33] Machine learning based early warning system enables accurate mortality risk prediction for COVID-19
    Gao, Yue
    Cai, Guang-Yao
    Fang, Wei
    Li, Hua-Yi
    Wang, Si-Yuan
    Chen, Lingxi
    Yu, Yang
    Liu, Dan
    Xu, Sen
    Cui, Peng-Fei
    Zeng, Shao-Qing
    Feng, Xin-Xia
    Yu, Rui-Di
    Wang, Ya
    Yuan, Yuan
    Jiao, Xiao-Fei
    Chi, Jian-Hua
    Liu, Jia-Hao
    Li, Ru-Yuan
    Zheng, Xu
    Song, Chun-Yan
    Jin, Ning
    Gong, Wen-Jian
    Liu, Xing-Yu
    Huang, Lei
    Tian, Xun
    Li, Lin
    Xing, Hui
    Ma, Ding
    Li, Chun-Rui
    Ye, Fei
    Gao, Qing-Lei
    NATURE COMMUNICATIONS, 2020, 11 (01)
  • [34] Machine Learning Techniques and Forecasting Methods for Analyzing and Predicting Covid-19
    Alshabeeb, Israa Ali
    Azeez, Ruaa Majeed
    Shakir, Wafaa Mohammed Ridha
    INTERNATIONAL JOURNAL OF MATHEMATICS AND COMPUTER SCIENCE, 2022, 17 (01) : 413 - 424
  • [36] Mortality Analysis of Patients with COVID-19 in Mexico Based on Risk Factors Applying Machine Learning Techniques
    Becerra-Sanchez, Aldonso
    Rodarte-Rodriguez, Armando
    Escalante-Garcia, Nivia I.
    Olvera-Gonzalez, Jose E.
    De la Rosa-Vargas, Jose I.
    Zepeda-Valles, Gustavo
    Velasquez-Martinez, Emmanuel de J.
    DIAGNOSTICS, 2022, 12 (06)
  • [37] A comparison of machine learning algorithms and traditional regression-based statistical modeling for predicting hypertension incidence in a Canadian population
    Chowdhury, Mohammad Ziaul Islam
    Leung, Alexander A.
    Walker, Robin L.
    Sikdar, Khokan C.
    O'Beirne, Maeve
    Quan, Hude
    Turin, Tanvir C.
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [39] Predicting COVID-19-Induced Lung Damage Based on Machine Learning Methods
    Vasilev, I. A.
    Petrovskiy, M. I.
    Mashechkin, I. V.
    Pankratyeva, L. L.
    PROGRAMMING AND COMPUTER SOFTWARE, 2022, 48 (04) : 243 - 255