Forecasting daily COVID-19 cases with gradient boosted regression trees and other methods: evidence from US cities

Citations: 0
Authors
Sen, Anindya [1 ]
Stevens, Nathaniel T. [2 ]
Tran, N. Ken [3 ]
Agarwal, Rishav R. [4 ]
Zhang, Qihuang [5 ]
Dubin, Joel A. [2 ,3 ]
Affiliations
[1] Univ Waterloo, Dept Econ, Waterloo, ON, Canada
[2] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON, Canada
[3] Univ Waterloo, Sch Publ Hlth Sci, Waterloo, ON, Canada
[4] Univ Waterloo, Cheriton Sch Comp Sci, Waterloo, ON, Canada
[5] McGill Univ, Dept Epidemiol Biostat & Occupat Hlth, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
daily COVID-19 cases; epidemiological surveillance; Metropolitan Statistical Areas; Gradient Boosted Regression Trees; Seasonal Autoregressive Integrated Moving Average (SARIMA); Susceptible, Infectious, or Recovered (SIR); Linear Mixed Effects
DOI
10.3389/fpubh.2023.1259410
Chinese Library Classification
R1 [Preventive Medicine, Hygiene]
Discipline classification codes
1004; 120402
Abstract
Introduction: There is a vast literature on the performance of different short-term forecasting models for country-specific COVID-19 cases, but much less research on city-level cases. This paper employs daily case counts for 25 Metropolitan Statistical Areas (MSAs) in the U.S. to evaluate the efficacy of a variety of statistical forecasting models with respect to 7-day and 28-day ahead predictions.
Methods: This study employed Gradient Boosted Regression Trees (GBRT), Linear Mixed Effects (LME), Susceptible, Infectious, or Recovered (SIR), and Seasonal Autoregressive Integrated Moving Average (SARIMA) models to generate daily forecasts of COVID-19 cases from November 2020 to March 2021.
Results: Consistent with other research that has employed Machine Learning (ML) based methods, we find that Median Absolute Percentage Error (MAPE) values for both 7-day ahead and 28-day ahead predictions from GBRTs are lower than the corresponding values from SIR, LME, and SARIMA specifications for the majority of MSAs during November-December 2020 and January 2021. GBRT and SARIMA models do not offer high-quality predictions for February 2021. However, SARIMA-generated MAPE values for 28-day ahead predictions are slightly lower than the corresponding GBRT estimates for March 2021.
Discussion: The results of this research demonstrate that basic ML models can lead to relatively accurate forecasts at the local level, which is important for resource allocation decisions and epidemiological surveillance by policymakers.
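The abstract compares models by Median Absolute Percentage Error but does not state the formula. The following is a minimal sketch of one common reading of that metric: the median, over forecast days, of |observed - forecast| / |observed|, expressed in percent. The function name `median_ape`, the choice to skip days with zero observed cases, and the sample numbers are all assumptions made for illustration; they are not taken from the paper.

```python
from statistics import median


def median_ape(actual, forecast):
    """Median absolute percentage error (in percent) over paired daily counts.

    Assumption: days with zero observed cases are skipped, because the
    percentage error is undefined there. The paper may handle them differently.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    errors = [
        100.0 * abs(a - f) / abs(a)
        for a, f in zip(actual, forecast)
        if a != 0
    ]
    if not errors:
        raise ValueError("no days with non-zero observed cases")
    return median(errors)


# Hypothetical 7-day horizon for a single MSA (made-up numbers).
observed = [100, 120, 80, 0, 200, 150, 110]
predicted = [90, 126, 100, 5, 180, 150, 121]
print(median_ape(observed, predicted))
```

Because the median is used rather than the mean, a single day with a very large percentage error (for example, a reporting backlog) has limited influence on the score, which may be one reason to prefer it for noisy daily counts.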
Pages: 17