An Improved Air Quality Index Machine Learning-Based Forecasting with Multivariate Data Imputation Approach

Cited by: 23
Authors
Alkabbani, Hanin [1 ]
Ramadan, Ashraf [2 ]
Zhu, Qinqin [1 ]
Elkamel, Ali [1 ]
Affiliations
[1] Univ Waterloo, Dept Chem Engn, 200 Univ Ave West, Waterloo, ON N2L 3G1, Canada
[2] Kuwait Inst Sci Res, Environm & Life Sci Res Ctr, Environm Pollut & Climate Program, POB 24885, Safat 13109, Kuwait
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
ambient air quality observations; AQI; artificial neural network; machine learning; missForest imputation; forecasting; ARTIFICIAL NEURAL-NETWORKS; HYBRID ARIMA; PREDICTION; FINE; POLLUTION; MODEL; PARTICLES; MORTALITY; ENERGY; SAND;
DOI
10.3390/atmos13071144
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Accurate, timely air quality index (AQI) forecasting helps industries select the most suitable air pollution control measures and helps the public reduce harmful exposure to pollution. This article proposes a comprehensive method to forecast AQIs. Initially, the work focused on predicting hourly ambient concentrations of PM2.5 and PM10 using artificial neural networks. Once the method was developed, the work was extended to the prediction of the other criteria pollutants, i.e., O3, SO2, NO2, and CO, which fed into the process of estimating the AQI. Predicting the AQI not only requires the selection of a robust forecasting model, but also relies heavily on a sequence of pre-processing steps to select predictors and handle different issues in the data, including gaps. The presented method dealt with gaps by imputing missing entries using missForest, a machine learning-based imputation technique that employs the random forest (RF) algorithm. Unlike the usual practice of using RF at the final forecasting stage, we utilized RF at the data pre-processing stage, i.e., for missing data imputation and feature selection, and we obtained promising results. The effectiveness of this imputation method was examined against a linear imputation method for the six criteria pollutants and the AQI. The proposed approach was validated against ambient air quality observations for Al-Jahra, a major city in Kuwait. The results showed that models trained on missForest-imputed data generalized well in AQI forecasting, achieving a prediction accuracy of 92.41% when tested on new, unseen data, which is better than earlier findings.
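The abstract states that predicted pollutant concentrations feed into the AQI estimate but does not spell out the index calculation. As a minimal sketch of how such a step commonly works, the Python below applies the piecewise-linear sub-index formula I = (I_hi - I_lo) / (C_hi - C_lo) * (C - C_lo) + I_lo and takes the maximum sub-index as the AQI. The breakpoint tables are the US EPA 24-hour PM2.5 (pre-2024 revision) and PM10 tables, used here purely as an assumption; the AQI scheme applied in the paper for Kuwait may differ. The names `sub_index`, `overall_aqi`, and the table constants are hypothetical.

```python
# Illustrative sketch only. Breakpoints are the US EPA 24-hour tables
# (PM2.5 pre-2024 revision, PM10), in micrograms per cubic metre.
# They are an assumption here; the paper's own AQI scheme may differ.

# Each row: (C_lo, C_hi, I_lo, I_hi)
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]

PM10_BREAKPOINTS = [
    (0, 54, 0, 50),
    (55, 154, 51, 100),
    (155, 254, 101, 150),
    (255, 354, 151, 200),
    (355, 424, 201, 300),
    (425, 504, 301, 400),
    (505, 604, 401, 500),
]

TABLES = {"pm25": PM25_BREAKPOINTS, "pm10": PM10_BREAKPOINTS}


def sub_index(conc, breakpoints):
    """Map one pollutant concentration to its sub-index by linear interpolation."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        # Simplification: a value falling in the small gap between two rows
        # is assigned to the upper row instead of being truncated first.
        if conc <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo)
    raise ValueError("concentration is above the highest tabulated breakpoint")


def overall_aqi(concentrations):
    """Return the AQI as the largest sub-index among the supplied pollutants."""
    return max(sub_index(c, TABLES[name]) for name, c in concentrations.items())


if __name__ == "__main__":
    print(overall_aqi({"pm25": 35.4, "pm10": 200}))
```

Taking the maximum means the single worst pollutant determines the reported index, so an error in any one forecast concentration can change the AQI category. Extending the sketch to O3, SO2, NO2, and CO would only require adding their breakpoint tables to `TABLES`.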
Pages: 26
Related papers
50 in total
  • [41] Machine Learning Approach to Predict and Compare the Air Quality Index in a Confined Environment
    Kumar, Sampath Harish
    Kanish, Thorapadi Chandrasekaran
    ENVIRONMENT PROTECTION ENGINEERING, 2024, 50 (04): : 5 - 27
  • [42] Analysis of Machine Learning Based Imputation of Missing Data
    Rizvi, Syed Tahir Hussain
    Latif, Muhammad Yasir
    Amin, Muhammad Saad
    Telmoudi, Achraf Jabeur
    Shah, Nasir Ali
    CYBERNETICS AND SYSTEMS, 2023,
  • [43] A comparative study of traditional machine learning and hybrid fuzzy inference system machine learning models for air quality index forecasting
    Ordenshiya, K. M.
    Revathi, Gk
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2025,
  • [44] Water-Quality Data Imputation with a High Percentage of Missing Values: A Machine Learning Approach
    Rodriguez, Rafael
    Pastorini, Marcos
    Etcheverry, Lorena
    Chreties, Christian
    Fossati, Monica
    Castro, Alberto
    Gorgoglione, Angela
    SUSTAINABILITY, 2021, 13 (11)
  • [45] Machine Learning-Based Demand Forecasting in Supply Chains
    Carbonneau, Real
    Vahidov, Rustam
    Laframboise, Kevin
    INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES, 2007, 3 (04) : 40 - 57
  • [46] Machine Learning-Based Demand Forecasting for an FMCG Retailer
    Ceran, Berkan
    Ozkan, Ece
    Eskiocak, Defne Idil
    Mert, Buse
    Yuceoglu, Birol
    INTELLIGENT AND FUZZY SYSTEMS, VOL 3, INFUS 2024, 2024, 1090 : 85 - 91
  • [47] Enhancing drought monitoring with a multivariate hydrometeorological index and machine learning-based prediction in the south of Iran
    Hossein Zamani
    Zohreh Pakdaman
    Marzieh Shakari
    Ommolbanin Bazrafshan
    Sajad Jamshidi
    Environmental Science and Pollution Research, 2025, 32 (9) : 5605 - 5627
  • [48] Air Quality Index and Air Pollutant Concentration Prediction Based on Machine Learning Algorithms
    Liu, Huixiang
    Li, Qing
    Yu, Dongbing
    Gu, Yu
    APPLIED SCIENCES-BASEL, 2019, 9 (19):
  • [49] A probabilistic forecasting approach for air quality spatio-temporal data based on kernel learning method
    Zhan, Haolin
    Zhu, Xin
    Hu, Jianming
    APPLIED SOFT COMPUTING, 2023, 132
  • [50] Development of Machine Learning-based Predictive Models for Air Quality Monitoring and Characterization
    Amado, Timothy M.
    Dela Cruz, Jennifer C.
    PROCEEDINGS OF TENCON 2018 - 2018 IEEE REGION 10 CONFERENCE, 2018, : 0668 - 0672