An Improved Air Quality Index Machine Learning-Based Forecasting with Multivariate Data Imputation Approach

Cited by: 23
Authors
Alkabbani, Hanin [1]
Ramadan, Ashraf [2]
Zhu, Qinqin [1]
Elkamel, Ali [1]
Affiliations
[1] Univ Waterloo, Dept Chem Engn, 200 Univ Ave West, Waterloo, ON N2L 3G1, Canada
[2] Kuwait Inst Sci Res, Environm & Life Sci Res Ctr, Environm Pollut & Climate Program, POB 24885, Safat 13109, Kuwait
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
ambient air quality observations; AQI; artificial neural network; machine learning; missForest imputation; forecasting; ARTIFICIAL NEURAL-NETWORKS; HYBRID ARIMA; PREDICTION; FINE; POLLUTION; MODEL; PARTICLES; MORTALITY; ENERGY; SAND;
DOI
10.3390/atmos13071144
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Accurate, timely air quality index (AQI) forecasting helps industries select the most suitable air pollution control measures and helps the public reduce harmful exposure to pollution. This article proposes a comprehensive method to forecast AQIs. Initially, the work focused on predicting hourly ambient concentrations of PM2.5 and PM10 using artificial neural networks. Once the method was developed, the work was extended to the prediction of the other criteria pollutants, i.e., O3, SO2, NO2, and CO, which fed into the process of estimating the AQI. Predicting the AQI requires not only a robust forecasting model but also a sequence of pre-processing steps to select predictors and handle issues in the data, including gaps. The presented method dealt with gaps by imputing missing entries using missForest, a machine learning-based imputation technique that employs the random forest (RF) algorithm. Unlike the usual practice of using RF at the final forecasting stage, we utilized RF at the data pre-processing stage, i.e., for missing data imputation and feature selection, and we obtained promising results. The effectiveness of this imputation method was examined against a linear imputation method for the six criteria pollutants and the AQI. The proposed approach was validated against ambient air quality observations for Al-Jahra, a major city in Kuwait. The results showed that models trained on missForest-imputed data could generalize AQI forecasting, reaching a prediction accuracy of 92.41% when tested on new unseen data, which improves on earlier findings.
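The abstract names two steps that a small example may make more concrete: the linear imputation baseline that missForest was compared against, and the aggregation of per-pollutant forecasts into a single AQI. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the common piecewise-linear sub-index form, with the overall AQI taken as the largest sub-index; the abstract does not state which AQI formula or breakpoints the paper used. The function names (`fill_linear`, `sub_index`, `aqi`) and every breakpoint value are hypothetical, and missForest itself is not reproduced here.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# One row per band: (conc_low, conc_high, index_low, index_high).
# All numbers below are made-up placeholders, NOT regulatory breakpoints.
Breakpoints = Sequence[Tuple[float, float, float, float]]

TOY_TABLES: Dict[str, Breakpoints] = {
    "PM2.5": [(0.0, 10.0, 0.0, 50.0), (10.0, 30.0, 50.0, 100.0)],
    "O3": [(0.0, 10.0, 0.0, 50.0), (10.0, 20.0, 50.0, 100.0)],
}


def fill_linear(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by straight-line interpolation between the
    nearest observed neighbours. Leading and trailing gaps stay None."""
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


def sub_index(conc: float, table: Breakpoints) -> float:
    """Map one pollutant concentration to its sub-index by linear
    interpolation inside the band that contains it."""
    for c_lo, c_hi, i_lo, i_hi in table:
        if c_lo <= conc <= c_hi:
            return (i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo
    raise ValueError(f"concentration {conc} is outside the breakpoint table")


def aqi(concs: Dict[str, float], tables: Dict[str, Breakpoints]) -> float:
    """Overall AQI under the assumed rule: the maximum pollutant sub-index."""
    return max(sub_index(c, tables[name]) for name, c in concs.items())


if __name__ == "__main__":
    hourly_pm25 = [10.0, None, None, 40.0]  # two missing hourly readings
    print(fill_linear(hourly_pm25))
    print(aqi({"PM2.5": 20.0, "O3": 5.0}, TOY_TABLES))
```

A linear fill of this kind uses only the neighbouring values of the same series, whereas missForest, as the abstract describes it, imputes with random forests; that difference is what the paper's comparison examines.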
Pages: 26