An integrated feature selection and machine learning framework for PM10 concentration prediction

Authors
Kalantari, Elham [1]
Gholami, Hamid [1]
Malakooti, Hossein [2]
Kaskaoutis, Dimitris G. [3, 4]
Saneei, Poorya [5]
Affiliations
[1] Univ Hormozgan, Dept Nat Resources Engn, Bandar Abbas, Hormozgan, Iran
[2] Univ Hormozgan, Fac Marine Sci & Technol, Dept Marine & Atmospher Sci Non Biol, Bandar Abbas, Iran
[3] Univ Western Macedonia, Dept Chem Engn, Kozani 50100, Greece
[4] Inst Environm Res & Sustainable Dev, Natl Observ Athens, Athens 15236, Greece
[5] Iran Univ Sci & Technol, Dept Comp Engn, Tehran, Iran
Keywords
Air pollution; Feature selection; Machine learning; PM10; Dust; Zabol; DUST STORMS; PM2.5; CONCENTRATIONS; PARTICULATE MATTER; RIDGE-REGRESSION; SISTAN REGION; COMPONENT ANALYSIS; POLLUTION; MORTALITY; CANCER; IRAN;
DOI
10.1016/j.apr.2025.102456
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The Sistan Basin in eastern Iran is a major dust source, presenting significant atmospheric, ecological, socio-economic, and health challenges. This study employed machine learning (ML) algorithms, including Random Forest (RF), K-Nearest Neighbor (KNN), Weighted K-Nearest Neighbor (WKNN), Support Vector Regression (SVR), and Least Absolute Shrinkage and Selection Operator (LASSO), to model and predict PM10 concentrations in Zabol City (2013-2022), using independent meteorological variables such as temperature, relative humidity, and wind speed and direction. Feature selection methods - Filter (Information Gain, F-Test, Correlation Coefficient), Wrapper (Recursive Feature Elimination, Sequential Forward/Backward Selection), and Embedded (LASSO, Elastic Net, Ridge Regression, RF Importance) - were applied to identify significant predictors, with embedded methods providing the best balance of simplicity, accuracy, and cost-efficiency. Among the models, RF achieved the highest seasonal performance (R2 = 0.75) during summer. RF's prediction R2 values for PM10 remained above 0.5 in all seasons, and RF consistently outperformed the other models. The WKNN model performed reasonably well across all seasons, ranking second among the models, while the LASSO model performed worse. The SVR model performed satisfactorily in specific seasons, such as summer and autumn. All models performed best during summer. Importantly, the models relied solely on readily available meteorological data, enabling accurate predictions of PM10 in this arid region of eastern Iran. The findings highlight the potential of ML techniques for developing air pollution prediction and warning systems, offering valuable support to policymakers in designing effective pollution control strategies and safeguarding public health.
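To make two of the techniques named in the abstract concrete, the following is a minimal sketch of a filter-style feature selection step (ranking predictors by absolute Pearson correlation with the target) followed by a weighted K-nearest-neighbor regression. It is not the authors' implementation. The data are synthetic, the coefficients linking "wind" and "humidity" to the PM10-like target are invented for illustration, and inverse-distance weighting is assumed as the WKNN weighting scheme because the abstract does not specify one. All function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(vx * vy)


def select_by_correlation(X, y, k):
    """Filter method: keep the k columns with the largest |r| against y."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: -scores[j])
    return sorted(ranked[:k])


def wknn_predict(X_train, y_train, x, k=5, eps=1e-9):
    """Weighted KNN regression with inverse-distance weights (an assumption)."""
    nearest = sorted(
        (math.dist(row, x), target) for row, target in zip(X_train, y_train)
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * t for w, (_, t) in zip(weights, nearest)) / sum(weights)


def r2_score(y_true, y_pred):
    """Coefficient of determination, the metric reported in the abstract."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def demo(seed=0, n=300):
    """Synthetic example: columns 0-1 drive the target, columns 2-3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        # All features are drawn in [0, 1], so no extra scaling is needed here.
        wind, humidity, noise_a, noise_b = (rng.random() for _ in range(4))
        X.append([wind, humidity, noise_a, noise_b])
        # Invented relationship, for illustration only (not from the study).
        y.append(60.0 + 100.0 * wind - 40.0 * humidity + rng.gauss(0.0, 5.0))
    split = int(0.8 * n)
    selected = select_by_correlation(X[:split], y[:split], k=2)
    X_train = [[row[j] for j in selected] for row in X[:split]]
    X_test = [[row[j] for j in selected] for row in X[split:]]
    preds = [wknn_predict(X_train, y[:split], x, k=5) for x in X_test]
    return selected, r2_score(y[split:], preds)


if __name__ == "__main__":
    selected, r2 = demo()
    print("selected feature indices:", selected)
    print("hold-out R2:", round(r2, 3))
```

With real meteorological predictors the features would need to be standardized before the distance computation, since KNN-type models are sensitive to differences in scale between variables such as temperature and wind speed.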
Pages: 19
Related papers
50 in total
  • [1] Spatial prediction of PM10 concentration using machine learning algorithms in Ankara, Turkey
    Bozdag, Asli
    Dokuz, Yesim
    Gokcek, Oznur Begum
    ENVIRONMENTAL POLLUTION, 2020, 263 (263)
  • [2] SHAP explanation of machine learning forecasting of PM10 concentration
    Ko, Byungjun
    Lee, Chaewon
    Kang, Taedong
    Choi, Ji Eun
    KOREAN JOURNAL OF APPLIED STATISTICS, 2025, 38 (01) : 79 - 88
  • [3] Classification Prediction of PM10 Concentration Using a Tree-Based Machine Learning Approach
    Shaziayani, Wan Nur
    Ul-Saufie, Ahmad Zia
    Mutalib, Sofianita
    Noor, Norazian Mohamad
    Zainordin, Nazatul Syadia
    ATMOSPHERE, 2022, 13 (04)
  • [4] Machine Learning Methods to Forecast the Concentration of PM10 in Lublin, Poland
    Kujawska, Justyna
    Kulisz, Monika
    Oleszczuk, Piotr
    Cel, Wojciech
    ENERGIES, 2022, 15 (17)
  • [5] A Case Analysis of Dust Weather and Prediction of PM10 Concentration Based on Machine Learning at the Tibetan Plateau
    Tan, Changrong
    Chen, Qi
    Qi, Donglin
    Xu, Liang
    Wang, Jiayun
    ATMOSPHERE, 2022, 13 (06)
  • [6] Evaluation and Predicting PM10 Concentration Using Multiple Linear Regression and Machine Learning
    Son, Sanghun
    Kim, Jinsoo
    KOREAN JOURNAL OF REMOTE SENSING, 2020, 36 (06) : 1711 - 1720
  • [7] Variable Selection Based on Statistical Learning Approaches to Improve PM10 Concentration Forecasting
    Ben Ishak, A.
    JOURNAL OF ENVIRONMENTAL INFORMATICS, 2017, 30 (02) : 79 - 94
  • [8] A new optimized hybrid approach combining machine learning with WRF-CHIMERE model for PM10 concentration prediction
    Chelhaoui, Youssef
    El Ass, Khalid
    Lachatre, Mathieu
    Bouakline, Oumaima
    Khomsi, Kenza
    El Moussaoui, Tawfik
    Arrad, Mouad
    Eddaif, Abdelhamid
    Albergel, Armand
    MODELING EARTH SYSTEMS AND ENVIRONMENT, 2024, 10 (04) : 5687 - 5701
  • [9] Machine learning models to quantify the influence of PM10 aerosol concentration on global solar radiation prediction in South Africa
    Govindasamy, Tamara Rosemary
    Chetty, Naven
    CLEANER ENGINEERING AND TECHNOLOGY, 2021, 2
  • [10] Prediction of PM2.5 and PM10 in Chiang Mai Province: A Comparison of Machine Learning Models
    Thongrod, Thitaporn
    Lim, Apiradee
    Ingviya, Thammasin
    Owusu, Benjamin Atta
    2022 37TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC 2022), 2022: 337 - 340