An integrated feature selection and machine learning framework for PM10 concentration prediction

Cited by: 0
Authors
Kalantari, Elham [1]
Gholami, Hamid [1]
Malakooti, Hossein [2]
Kaskaoutis, Dimitris G. [3, 4]
Saneei, Poorya [5]
Affiliations
[1] Univ Hormozgan, Dept Nat Resources Engn, Bandar Abbas, Hormozgan, Iran
[2] Univ Hormozgan, Fac Marine Sci & Technol, Dept Marine & Atmospher Sci Non Biol, Bandar Abbas, Iran
[3] Univ Western Macedonia, Dept Chem Engn, Kozani 50100, Greece
[4] Inst Environm Res & Sustainable Dev, Natl Observ Athens, Athens 15236, Greece
[5] Iran Univ Sci & Technol, Dept Comp Engn, Tehran, Iran
Keywords
Air pollution; Feature selection; Machine learning; PM10; Dust; Zabol; DUST STORMS; PM2.5; CONCENTRATIONS; PARTICULATE MATTER; RIDGE-REGRESSION; SISTAN REGION; COMPONENT ANALYSIS; POLLUTION; MORTALITY; CANCER; IRAN
DOI
10.1016/j.apr.2025.102456
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The Sistan Basin in eastern Iran is a major dust source, presenting significant atmospheric, ecological, socio-economic, and health challenges. This study employed machine learning (ML) algorithms, including Random Forest (RF), K-Nearest Neighbor (KNN), Weighted K-Nearest Neighbor (WKNN), Support Vector Regression (SVR), and Least Absolute Shrinkage and Selection Operator (LASSO), to model and predict PM10 concentrations in Zabol City (2013-2022), using meteorological variables such as temperature, relative humidity, wind speed, and wind direction as independent predictors. Feature selection methods - Filter (Information Gain, F-Test, Correlation Coefficient), Wrapper (Recursive Feature Elimination, Sequential Forward/Backward Selection), and Embedded (LASSO, Elastic Net, Ridge Regression, RF Importance) - were applied to identify significant predictors, with embedded methods providing the best balance of simplicity, accuracy, and cost-efficiency. Among the models, RF achieved the highest seasonal performance (R² = 0.75), in summer. RF's prediction R² values for PM10 remained above 0.5 in all seasons and consistently exceeded those of the other models. The WKNN model performed reasonably well across all seasons, ranking second among the models, while the LASSO model performed more weakly. The SVR model performed satisfactorily in specific seasons, such as summer and autumn. All models performed better in summer than in the other seasons. Importantly, the models relied solely on readily available meteorological data, enabling accurate predictions of PM10 in this arid region of eastern Iran. The findings highlight the potential of ML techniques for developing air pollution prediction and warning systems, offering valuable support to policymakers in designing effective pollution control strategies and safeguarding public health.
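The abstract names WKNN and reports model skill as R², but does not give the weighting scheme or the evaluation code. The following is a minimal sketch of one common variant, inverse-distance-weighted k-nearest-neighbour regression, together with the coefficient of determination. The function names, the value of `k`, the `eps` smoothing term, and the toy temperature, relative humidity, and wind speed values are all illustrative assumptions and are not taken from the paper.

```python
import math


def wknn_predict(train_X, train_y, query, k=3, eps=1e-9):
    """Inverse-distance-weighted k-NN regression (assumed weighting scheme)."""
    # Pair each training target with its Euclidean distance to the query,
    # then keep the k closest pairs.
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    # Closer neighbours get larger weights; eps avoids division by zero.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations: (temperature in C, relative humidity in %,
# wind speed in m/s) with made-up PM10 targets in ug/m3.
train_X = [
    (30.0, 20.0, 8.0),
    (32.0, 15.0, 12.0),
    (18.0, 45.0, 3.0),
    (15.0, 50.0, 2.0),
    (35.0, 10.0, 15.0),
]
train_y = [180.0, 260.0, 70.0, 55.0, 340.0]

prediction = wknn_predict(train_X, train_y, (31.0, 18.0, 10.0), k=3)
print(f"Predicted PM10: {prediction:.1f}")
```

In practice the predictors would normally be standardized before computing distances, since relative humidity and wind speed are on different scales, and R² would be computed on held-out data for each season.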
Pages: 19
Related papers
50 in total
  • [21] FORECASTING PM10 CONCENTRATIONS BASED ON MACHINE AND DEEP LEARNING
    Isikdag, Umit
    FRESENIUS ENVIRONMENTAL BULLETIN, 2022, 31 (8A): 8385-8391
  • [22] Performance of machine learning models to forecast PM10 levels
    Mampitiya, Lakindu
    Rathnayake, Namal
    Hoshino, Yukinobu
    Rathnayake, Upaka
    METHODSX, 2024, 12
  • [23] Machine Learning Techniques for PM10 Levels Forecast in Bogota
    Mejia Martinez, Nicolas
    Melissa Montes, Laura
    Mura, Ivan
    Felipe Franco, Juan
    2018 ICAI WORKSHOPS (ICAIW), 2018
  • [24] The Determination of Recommended Concentration of Outdoor PM10 in the Calculation of Filter Selection
    Fan, Yuesheng
    Si, Pengfei
    Li, Angui
    Li, Boweng
    2010 4TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING (ICBBE 2010), 2010
  • [25] PM10 Concentration Forecast Based on Wavelet Support Vector Machine
    Li, Yong
    Tao, Yan
    2017 INTERNATIONAL CONFERENCE ON SENSING, DIAGNOSTICS, PROGNOSTICS, AND CONTROL (SDPC), 2017: 383-386
  • [26] Hybrid Prediction Model of Air Pollutant Concentration for PM2.5 and PM10
    Ma, Yanrong
    Ma, Jun
    Wang, Yifan
    ATMOSPHERE, 2023, 14 (07)
  • [27] Prediction of PM10 concentration on the basis of high resolution weather forecasting
    Klingner, Matthias
    Saehn, Elke
    METEOROLOGISCHE ZEITSCHRIFT, 2008, 17 (03): 263-272
  • [28] Prediction of PM10 concentration in Seoul, Korea using Bayesian network
    Jo, Minjoo
    Oh, Rosy
    Oh, Man-Suk
    COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2023, 30 (05): 517-530
  • [29] Enhancing software defect prediction: a framework with improved feature selection and ensemble machine learning
    Ali, Misbah
    Mazhar, Tehseen
    Al-Rasheed, Amal
    Shahzad, Tariq
    Ghadi, Yazeed Yasin
    Khan, Muhammad Amir
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [30] Prediction of Outpatient Visits for Upper Respiratory Tract Infections by Machine Learning of PM2.5 and PM10 Levels in Taiwan
    Yang, Pei-Hsuan
    Hsieh, Tren
    Lin, Gen-Min
    Chen, Mei-Juan
    Yeh, Chia-Hung
    Huang, Zhi-Xiang
    Yang, Chieh-Ming
    2018 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS-TAIWAN (ICCE-TW), 2018