An integrated feature selection and machine learning framework for PM10 concentration prediction

Cited by: 0
Authors
Kalantari, Elham [1]
Gholami, Hamid [1]
Malakooti, Hossein [2]
Kaskaoutis, Dimitris G. [3, 4]
Saneei, Poorya [5]
Affiliations
[1] Univ Hormozgan, Dept Nat Resources Engn, Bandar Abbas, Hormozgan, Iran
[2] Univ Hormozgan, Fac Marine Sci & Technol, Dept Marine & Atmospher Sci Non Biol, Bandar Abbas, Iran
[3] Univ Western Macedonia, Dept Chem Engn, Kozani 50100, Greece
[4] Inst Environm Res & Sustainable Dev, Natl Observ Athens, Athens 15236, Greece
[5] Iran Univ Sci & Technol, Dept Comp Engn, Tehran, Iran
Keywords
Air pollution; Feature selection; Machine learning; PM10; Dust; Zabol; DUST STORMS; PM2.5; CONCENTRATIONS; PARTICULATE MATTER; RIDGE-REGRESSION; SISTAN REGION; COMPONENT ANALYSIS; POLLUTION; MORTALITY; CANCER; IRAN;
DOI
10.1016/j.apr.2025.102456
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The Sistan Basin in eastern Iran is a major dust source, presenting significant atmospheric, ecological, socio-economic, and health challenges. This study employed machine learning (ML) algorithms, including Random Forest (RF), K-Nearest Neighbor (KNN), Weighted K-Nearest Neighbor (WKNN), Support Vector Regression (SVR), and Least Absolute Shrinkage and Selection Operator (LASSO), to model and predict PM10 concentrations in Zabol City (2013-2022), using independent meteorological variables such as temperature, relative humidity, and wind speed and direction. Three families of feature selection methods were applied to identify significant predictors: Filter (Information Gain, F-Test, Correlation Coefficient), Wrapper (Recursive Feature Elimination, Sequential Forward/Backward Selection), and Embedded (LASSO, Elastic Net, Ridge Regression, RF Importance). Embedded methods provided the best balance of simplicity, accuracy, and cost-efficiency. Among the models, RF demonstrated the highest seasonal performance (R² = 0.75) during summer. RF's R² values for PM10 prediction remained above 0.5 in all seasons, consistently outperforming the other models. The WKNN model performed reasonably well across all seasons, ranking second among the models, while the LASSO model demonstrated weaker performance. The SVR model showed satisfactory performance in specific seasons, such as summer and autumn. A common feature of all models was their better performance during summer. Importantly, the models relied solely on readily available meteorological data, enabling accurate predictions of PM10 in this arid region of eastern Iran. The findings highlight the potential of ML techniques for developing air pollution prediction and warning systems, offering valuable support to policymakers in designing effective pollution control strategies and safeguarding public health.
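The general idea of pairing an embedded feature selector with an RF regressor can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic, the column names (including the hypothetical `random_noise` column), the LASSO penalty `alpha=0.1`, and the train/test split are all assumptions, and the printed score says nothing about the results reported in the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for meteorological predictors (assumption: not real Zabol data).
rng = np.random.default_rng(0)
feature_names = [
    "temperature",
    "relative_humidity",
    "wind_speed",
    "wind_direction",
    "random_noise",  # hypothetical irrelevant column
]
n_samples = 600
X = rng.normal(size=(n_samples, len(feature_names)))

# Assumed toy relationship: the target rises with wind speed and temperature
# and falls with humidity; the remaining columns carry no signal.
y = (
    3.0 * X[:, 2]
    + 1.5 * X[:, 0]
    - 1.0 * X[:, 1]
    + rng.normal(scale=0.5, size=n_samples)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Embedded selection: LASSO shrinks weak coefficients to zero, and
# SelectFromModel keeps only the features with non-negligible weights.
# The Random Forest is then fitted on the retained features.
model = make_pipeline(
    StandardScaler(),
    SelectFromModel(Lasso(alpha=0.1)),
    RandomForestRegressor(n_estimators=100, random_state=0),
)
model.fit(X_train, y_train)

selected_mask = model.named_steps["selectfrommodel"].get_support()
selected = [name for name, keep in zip(feature_names, selected_mask) if keep]
r2 = r2_score(y_test, model.predict(X_test))

print("selected features:", selected)
print(f"held-out R2: {r2:.2f}")
```

Putting the selector inside the pipeline means it is fitted on the training split only, which avoids leaking information from the held-out data into the choice of predictors.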
Pages: 19