Forecasting daily COVID-19 cases with gradient boosted regression trees and other methods: evidence from US cities

Cited by: 0
Authors
Sen, Anindya [1 ]
Stevens, Nathaniel T. [2 ]
Tran, N. Ken [3 ]
Agarwal, Rishav R. [4 ]
Zhang, Qihuang [5 ]
Dubin, Joel A. [2 ,3 ]
Affiliations
[1] Univ Waterloo, Dept Econ, Waterloo, ON, Canada
[2] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON, Canada
[3] Univ Waterloo, Sch Publ Hlth Sci, Waterloo, ON, Canada
[4] Univ Waterloo, Cheriton Sch Comp Sci, Waterloo, ON, Canada
[5] McGill Univ, Dept Epidemiol Biostat & Occupat Hlth, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
daily COVID-19 cases; epidemiological surveillance; Metropolitan Statistical Areas; Gradient Boosted Regression Trees; Seasonal Autoregressive Integrated Moving Average (SARIMA); Susceptible, Infectious, or Recovered (SIR); Linear Mixed Effects;
DOI
10.3389/fpubh.2023.1259410
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Introduction: There is a vast literature on the performance of different short-term forecasting models for country-specific COVID-19 cases, but much less research on city-level cases. This paper employs daily case counts for 25 Metropolitan Statistical Areas (MSAs) in the U.S. to evaluate the efficacy of a variety of statistical forecasting models with respect to 7-day and 28-day ahead predictions.
Methods: This study employed Gradient Boosted Regression Trees (GBRT), Linear Mixed Effects (LME), Susceptible, Infectious, or Recovered (SIR), and Seasonal Autoregressive Integrated Moving Average (SARIMA) models to generate daily forecasts of COVID-19 cases from November 2020 to March 2021.
Results: Consistent with other research that has employed Machine Learning (ML) based methods, we find that Median Absolute Percentage Error (MAPE) values for both 7-day ahead and 28-day ahead predictions from GBRTs are lower than the corresponding values from SIR, LME, and SARIMA specifications for the majority of MSAs during November-December 2020 and January 2021. GBRT and SARIMA models do not offer high-quality predictions for February 2021. However, SARIMA-generated MAPE values for 28-day ahead predictions are slightly lower than the corresponding GBRT estimates for March 2021.
Discussion: The results of this research demonstrate that basic ML models can lead to relatively accurate forecasts at the local level, which is important for resource allocation decisions and epidemiological surveillance by policymakers.
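The abstract compares models by Median Absolute Percentage Error. As a minimal sketch of how such a metric could be computed for one forecast window, the snippet below takes the median of the daily absolute percentage errors. The function name `median_ape`, the choice to skip days with zero reported cases, and the sample numbers are all assumptions made for illustration; they are not taken from the paper.

```python
from statistics import median


def median_ape(actual, predicted):
    """Median absolute percentage error, in percent.

    Assumption: days with zero actual cases are skipped, because the
    percentage error is undefined there. The paper may handle this differently.
    """
    errors = [
        100.0 * abs(a - p) / abs(a)
        for a, p in zip(actual, predicted)
        if a != 0
    ]
    if not errors:
        raise ValueError("no days with nonzero actual cases")
    return median(errors)


# Hypothetical daily case counts and forecasts (not data from the study).
actual_cases = [100, 200, 400]
forecast_cases = [110, 180, 400]
print(median_ape(actual_cases, forecast_cases))
```

Using the median rather than the mean keeps a single badly missed day from dominating the score, which may matter when daily counts include reporting spikes.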
Pages: 17
Related papers
50 in total
  • [21] Modeling of daily confirmed Saudi COVID-19 cases using inverted exponential regression
    Al-Dawsari, Sarah R.
    Sultan, Khalaf S.
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2021, 18 (03) : 2303 - 2330
  • [22] COVID-19 and informal work: Evidence from 11 cities
    Chen, Martha Alter
    Grapsa, Erofili
    Ismail, Ghida
    Rogan, Michael
    Valdivia, Marcela
    Alfers, Laura
    Harvey, Jenna
    Ogando, Ana Carolina
    Reed, Sarah Orleans
    Roever, Sally
    INTERNATIONAL LABOUR REVIEW, 2022, 161 (01) : 29 - 58
  • [23] Meteorological Normalisation Using Boosted Regression Trees to Estimate the Impact of COVID-19 Restrictions on Air Quality Levels
    Ceballos-Santos, Sandra
    Gonzalez-Pardo, Jaime
    Carslaw, David C.
    Santurtun, Ana
    Santibanez, Miguel
    Fernandez-Olmo, Ignacio
    INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2021, 18 (24)
  • [24] A Novel βSA Ensemble Model for Forecasting the Number of Confirmed COVID-19 Cases in the US
    Shih, Dong-Her
    Wu, Ting-Wei
    Shih, Ming-Hung
    Yang, Min-Jui
    Yen, David C.
    MATHEMATICS, 2022, 10 (05)
  • [25] Socioeconomic gradient in COVID-19 vaccination: evidence from Israel
    Saban, Mor
    Myers, Vicki
    Ben-Shetrit, Shani
    Wilf-Miron, Rachel
    INTERNATIONAL JOURNAL FOR EQUITY IN HEALTH, 2021, 20 (01)
  • [26] Socioeconomic gradient in COVID-19 vaccination: evidence from Israel
    Mor Saban
    Vicki Myers
    Shani Ben-Shetrit
    Rachel Wilf-Miron
    International Journal for Equity in Health, 20
  • [27] The Effects of COVID-19 Infection on Opposition to COVID-19 Policies: Evidence from the US Congress
    Dickson, Zachary P.
    Yildirim, Tevfik Murat
    POLITICAL COMMUNICATION, 2025, 42 (01) : 127 - 150
  • [28] A Boosted Evolutionary Neural Architecture Search for Time Series Forecasting with Application to South African COVID-19 Cases
    Akinola, Solomon Oluwole
    Wang, Qing-Guo
    Olukanmi, Peter
    Marwala, Tshilidzi
    INTERNATIONAL JOURNAL OF ONLINE AND BIOMEDICAL ENGINEERING, 2023, 19 (14) : 107 - 130
  • [29] The higher temperature and ultraviolet, the lower COVID-19 prevalence-meta-regression of data from large US cities
    Takagi, Hisato
    Kuno, Toshiki
    Yokoyama, Yujiro
    Ueyama, Hiroki
    Matsushiro, Takuya
    Hari, Yosuke
    Ando, Tomo
    AMERICAN JOURNAL OF INFECTION CONTROL, 2020, 48 (10) : 1281 - 1285
  • [30] Welfare costs of COVID-19: Evidence from US counties
    Yilmazkuday, Hakan
    JOURNAL OF REGIONAL SCIENCE, 2021, 61 (04) : 826 - 848