Forecasting PM2.5 concentration levels using shallow machine learning models on the Monterrey Metropolitan Area in Mexico

Cited by: 3
Authors
Pozo-Luyo, Cesar Alejandro [1]
Cruz-Duarte, Jorge M. [1]
Amaya, Ivan [1]
Ortiz-Bayliss, Jose Carlos [1]
Affiliations
[1] Tecnol Monterrey, Sch Engn & Sci, Ave Eugenio Garza Sada 2501, Monterrey 64700, Nuevo Leon, Mexico
Keywords
Air quality forecasting; PM2.5; forecasting; Machine learning; Regression; METEOROLOGICAL CONDITIONS; AIR-QUALITY; EXPOSURE
DOI
10.1016/j.apr.2023.101898
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The Monterrey Metropolitan Area is one of the most densely populated and polluted regions in Latin America, so early warnings to the population when pollutant concentrations reach high levels are critical. Such warnings let people at higher health risk make informed decisions about when to go out, mitigating future health complications, and forecasting models make it possible to issue them in time. In this work, we implement a set of short-term shallow machine learning models intended to serve as a baseline for future forecasting analyses of PM2.5 concentration levels in the Monterrey Metropolitan Area. The proposed approach starts with preprocessing: multiple imputation by chained equations for missing values, the incorporation of time metadata, and target winsorization. We then optimize the parameters of the machine learning models with random search and k-fold cross-validation, obtaining favorable results. We devise these models for a single-step, single-station analysis on an hourly multivariate air quality dataset (77203 rows and 16 columns, spanning January 1, 2015 00:00:00 to April 17, 2022 23:00:00) and compare them using standard regression metrics. The best-performing forecasting model was an Extra Trees Regressor, with a Root Mean Squared Error of 0.013, a Mean Absolute Error of 0.006 (equivalent to a Mean Absolute Percentage Error of 0.294% and a Symmetric Mean Absolute Percentage Error of 0.078%), and a Maximum Error of 0.187 µg/m³.
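As a rough illustration of some of the preprocessing and evaluation steps named in the abstract, the following minimal Python sketch derives time metadata from an hourly timestamp, winsorizes a target series, and computes the five reported regression metrics. It uses only the standard library and does not reproduce the paper's imputation, random search, or Extra Trees model. The function names, the percentile thresholds, the choice of calendar features, and the exact MAPE and SMAPE formulas are assumptions made for illustration, not details taken from the paper.

```python
import math
from datetime import datetime


def time_features(ts: datetime) -> dict:
    """Derive simple calendar features from a timestamp (hypothetical feature set)."""
    return {
        "hour": ts.hour,
        "weekday": ts.weekday(),  # Monday = 0
        "month": ts.month,
    }


def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clip values to the given percentiles (nearest-rank method, assumed thresholds)."""
    ordered = sorted(values)
    n = len(ordered)

    def rank(pct):
        # Nearest-rank percentile, kept inside the valid index range
        idx = math.ceil(pct / 100.0 * n) - 1
        return ordered[min(max(idx, 0), n - 1)]

    low, high = rank(lower_pct), rank(upper_pct)
    return [min(max(v, low), high) for v in values]


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (%), SMAPE (%), and maximum absolute error."""
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(abs_err)
    return {
        "rmse": math.sqrt(sum(e * e for e in abs_err) / n),
        "mae": sum(abs_err) / n,
        # Assumed MAPE form: mean(|error| / |true|) * 100; needs nonzero targets
        "mape": 100.0 * sum(e / abs(t) for e, t in zip(abs_err, y_true)) / n,
        # Assumed SMAPE form: mean(2 * |error| / (|true| + |pred|)) * 100
        "smape": 100.0
        * sum(2 * e / (abs(t) + abs(p)) for e, t, p in zip(abs_err, y_true, y_pred))
        / n,
        "max_error": max(abs_err),
    }
```

For example, `regression_metrics([10, 20], [12, 18])` yields an RMSE, MAE, and maximum error of 2 each. In practice, library routines such as scikit-learn's metrics module would likely replace these hand-written helpers.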
Pages: 11
Related papers
50 in total
  • [31] Forecasting PM2.5 levels in Santiago de Chile using deep learning neural networks
    Menares, Camilo
    Perez, Patricio
    Parraguez, Santiago
    Fleming, Zoe L.
    URBAN CLIMATE, 2021, 38
  • [32] PM2.5 forecasting for an urban area based on deep learning and decomposition method
    Zaini, Nur'atiah
    Ean, Lee Woen
    Ahmed, Ali Najah
    Malek, Marlinda Abdul
    Chow, Ming Fai
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [33] Evaluation of Time Series Forecasting Models for Estimation of PM2.5 Levels in Air
    Garg, Satvik
    Jindal, Himanshu
    2021 6TH INTERNATIONAL CONFERENCE FOR CONVERGENCE IN TECHNOLOGY (I2CT), 2021
  • [34] The stable carbon isotope composition of PM2.5 and PM10 in Mexico City Metropolitan Area air
    Lopez-Veneroni, D.
    ATMOSPHERIC ENVIRONMENT, 2009, 43 (29) : 4491 - 4502
  • [36] Evaluation of Different Machine Learning Approaches to Forecasting PM2.5 Mass Concentrations
    Karimian, Hamed
    Li, Qi
    Wu, Chunlin
    Qi, Yanlin
    Mo, Yuqin
    Chen, Gong
    Zhang, Xianfeng
    Sachdeva, Sonali
    AEROSOL AND AIR QUALITY RESEARCH, 2019, 19 (06) : 1400 - 1410
  • [37] A machine learning-based model to estimate PM2.5 concentration levels in Delhi's atmosphere
    Kumar, Saurabh
    Mishra, Shweta
    Singh, Sunil Kumar
    HELIYON, 2020, 6 (11)
  • [38] Commuters' exposure to PM2.5, CO, and benzene in public transport in the metropolitan area of Mexico City
    Gómez-Perales, JE
    Colvile, RN
    Nieuwenhuijsen, MJ
    Fernández-Bremauntz, A
    Gutiérrez-Avedoy, VJ
    Páramo-Figueroa, VH
    Blanco-Jiménez, S
    Bueno-López, E
    Mandujano, F
    Bernabé-Cabanillas, R
    Ortiz-Segovia, E
    ATMOSPHERIC ENVIRONMENT, 2004, 38 (08) : 1219 - 1229
  • [39] A nested machine learning approach to short-term PM2.5 prediction in metropolitan areas using PM2.5 data from different sensor networks
    Li, Jing
    Crooks, James
    Murdock, Jennifer
    de Souza, Priyanka
    Hohs, Kirk
    Obermann, Bill
    Stockman, Tehya
    SCIENCE OF THE TOTAL ENVIRONMENT, 2023, 873
  • [40] PM2.5 concentration prediction using machine learning algorithms: an approach to virtual monitoring stations
    Makhdoomi, Ahmad
    Sarkhosh, Maryam
    Ziaei, Somayyeh
    SCIENTIFIC REPORTS, 2025, 15 (01)