Predicting COVID-19 mortality risk in Toronto, Canada: a comparison of tree-based and regression-based machine learning methods

Cited by: 13
Authors
Feng, Cindy [1]
Kephart, George [1]
Juarez-Colunga, Elizabeth [2]
Affiliations
[1] Dalhousie Univ, Dept Community Hlth & Epidemiol, Fac Med, 5790 Univ Ave, Halifax, NS B3H 1V7, Canada
[2] Univ Colorado, Dept Biostat & Informat, Anschutz Med Campus, Aurora, CO 80045 USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
COVID-19; Mortality; Predictive model; Generalized additive model; Classification trees; Extreme gradient boosting; Logistic regression; Models
DOI
10.1186/s12874-021-01441-4
Chinese Library Classification number
R19 [Health care organizations and services (health services administration)]
Abstract
Background: Coronavirus disease (COVID-19) presents an unprecedented threat to global health. Accurately predicting the mortality risk among infected individuals is crucial for prioritizing medical care and mitigating the burden on the healthcare system. The present study aimed to assess the predictive accuracy of machine learning methods for predicting COVID-19 mortality risk.

Methods: We compared the performance of classification tree, random forest (RF), extreme gradient boosting (XGBoost), logistic regression, generalized additive model (GAM) and linear discriminant analysis (LDA) in predicting the mortality risk among 49,216 COVID-19 positive cases in Toronto, Canada, reported from March 1 to December 10, 2020. We used repeated split-sample validation and k-steps-ahead forecasting validation. Predictive models were estimated using training samples, and the predictive accuracy of each method on the testing samples was assessed using the area under the receiver operating characteristic curve (AUC), the Brier score, the calibration intercept and the calibration slope.

Results: We found that XGBoost was highly discriminative, with an AUC of 0.9669, and had superior performance over conventional tree-based methods, i.e., classification tree or RF methods, for predicting COVID-19 mortality risk. Regression-based methods (logistic, GAM and LASSO) had comparable performance to XGBoost, with slightly lower AUCs and higher Brier scores.

Conclusions: XGBoost offers superior performance over conventional tree-based methods and a minor improvement over regression-based methods for predicting COVID-19 mortality risk in the study population.
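The four accuracy measures named in the Methods can be illustrated with a short sketch. This is not the authors' code: the function names, the toy data, and the specific definitions of the calibration measures are assumptions. Here the calibration slope is taken as the coefficient from a logistic regression of the outcome on the logit of the predicted risk, and the calibration intercept as the intercept of a logistic model with that logit entered as a fixed offset (calibration-in-the-large), which is one common convention and may differ from the one used in the paper.

```python
import math

def auc(y, p):
    """Share of (death, survivor) pairs where the death has the higher
    predicted risk; ties count as half. O(n^2), fine for a sketch."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier(y, p):
    """Mean squared difference between predicted risk and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def _logit(p, eps=1e-12):
    p = min(max(p, eps), 1 - eps)  # keep the log finite
    return math.log(p / (1 - p))

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def calibration_slope(y, p, iters=50):
    """Fit y ~ a + b * logit(p) by Newton-Raphson and return b.
    b < 1 suggests overconfident predictions (assumed convention)."""
    x = [_logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        mu = [_sigmoid(a + b * xi) for xi in x]
        w = [mi * (1 - mi) for mi in mu]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b

def calibration_intercept(y, p, iters=50):
    """Fit y ~ a + offset(logit(p)) and return a.
    a = 0 means mean predicted risk matches the observed death rate."""
    x = [_logit(pi) for pi in p]
    a = 0.0
    for _ in range(iters):
        mu = [_sigmoid(a + xi) for xi in x]
        grad = sum(yi - mi for yi, mi in zip(y, mu))
        hess = sum(mi * (1 - mi) for mi in mu)
        a += grad / hess
    return a

# Toy data (hypothetical, not from the study): 10 cases, 1 = died.
y_toy = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
p_toy = [0.2, 0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8, 0.8]

if __name__ == "__main__":
    print("AUC:", auc(y_toy, p_toy))
    print("Brier score:", brier(y_toy, p_toy))
    print("Calibration slope:", calibration_slope(y_toy, p_toy))
    print("Calibration intercept:", calibration_intercept(y_toy, p_toy))
```

In the toy data the predicted risks equal the observed death rates within each group, so the sketch should report a slope near 1 and an intercept near 0; a higher AUC and a lower Brier score indicate better performance.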
Pages: 14