An integrated feature selection and machine learning framework for PM10 concentration prediction

Cited by: 0
Authors
Kalantari, Elham [1]
Gholami, Hamid [1]
Malakooti, Hossein [2]
Kaskaoutis, Dimitris G. [3, 4]
Saneei, Poorya [5]
Affiliations
[1] Univ Hormozgan, Dept Nat Resources Engn, Bandar Abbas, Hormozgan, Iran
[2] Univ Hormozgan, Fac Marine Sci & Technol, Dept Marine & Atmospher Sci Non Biol, Bandar Abbas, Iran
[3] Univ Western Macedonia, Dept Chem Engn, Kozani 50100, Greece
[4] Inst Environm Res & Sustainable Dev, Natl Observ Athens, Athens 15236, Greece
[5] Iran Univ Sci & Technol, Dept Comp Engn, Tehran, Iran
Keywords
Air pollution; Feature selection; Machine learning; PM10; Dust; Zabol; DUST STORMS; PM2.5; CONCENTRATIONS; PARTICULATE MATTER; RIDGE-REGRESSION; SISTAN REGION; COMPONENT ANALYSIS; POLLUTION; MORTALITY; CANCER; IRAN;
DOI
10.1016/j.apr.2025.102456
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The Sistan Basin in eastern Iran is a major dust source, presenting significant atmospheric, ecological, socio-economic, and health challenges. This study employed machine learning (ML) algorithms, including Random Forest (RF), K-Nearest Neighbor (KNN), Weighted K-Nearest Neighbor (WKNN), Support Vector Regression (SVR), and Least Absolute Shrinkage and Selection Operator (LASSO), to model and predict PM10 concentrations in Zabol City (2013-2022), using independent meteorological variables such as temperature, relative humidity, wind speed, and wind direction. Three families of feature selection methods were applied to identify significant predictors: Filter (Information Gain, F-Test, Correlation Coefficient), Wrapper (Recursive Feature Elimination, Sequential Forward/Backward Selection), and Embedded (LASSO, Elastic Net, Ridge Regression, RF Importance). The embedded methods provided the best balance of simplicity, accuracy, and cost-efficiency. Among the models, RF achieved the highest seasonal performance, with R² = 0.75 in summer. RF's R² values for PM10 prediction remained above 0.5 in all seasons, and RF consistently outperformed the other models. The WKNN model performed reasonably well across all seasons, ranking second among the models, while the LASSO model demonstrated weaker performance. The SVR model showed satisfactory performance in specific seasons, such as summer and autumn. All models performed better during summer. Importantly, the models relied solely on readily available meteorological data, enabling accurate predictions of PM10 in this arid region of eastern Iran. The findings highlight the potential of ML techniques for developing air pollution prediction and warning systems, offering valuable support to policymakers in designing effective pollution control strategies and safeguarding public health.
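The abstract names WKNN as one of the models and R² as the evaluation metric but gives no implementation details. The following is a minimal sketch of both ideas, not the authors' method: it assumes inverse-distance weighting for WKNN (the paper may use a different weighting scheme), and the function names, the toy feature rows (temperature, relative humidity, wind speed), and the PM10 values are all hypothetical, made up purely for illustration.

```python
import math


def wknn_predict(train_X, train_y, query, k=3, eps=1e-9):
    """Weighted k-nearest-neighbour regression.

    ASSUMPTION: neighbours are weighted by inverse Euclidean distance;
    the abstract does not state which weighting the study used.
    """
    # Pair each training target with its distance to the query,
    # then keep the k closest points.
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    # eps avoids division by zero when the query equals a training point.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# HYPOTHETICAL toy data: (temperature in C, relative humidity in %, wind speed in m/s)
train_X = [(30.0, 20.0, 4.0), (35.0, 10.0, 12.0), (25.0, 40.0, 2.0), (38.0, 8.0, 15.0)]
# HYPOTHETICAL PM10 concentrations for the rows above
train_y = [100.0, 400.0, 60.0, 550.0]

if __name__ == "__main__":
    estimate = wknn_predict(train_X, train_y, (33.0, 15.0, 8.0), k=3)
    print(f"Estimated PM10 for the query point: {estimate:.1f}")
```

In practice the features would be standardized before computing distances, since temperature, humidity, and wind speed are on different scales; that step is omitted here to keep the sketch short.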
Pages: 19