Evaluation of data preprocessing and feature selection process for prediction of hourly PM10 concentration using long short-term memory models

Cited by: 14
Authors
Aksangur, Ipek [1]
Eren, Beytullah [1,2]
Erden, Caner [3,4]
Affiliations
[1] Sakarya Univ, Fac Engn, Dept Environ Engn, Esentepe, Sakarya, Turkey
[2] Harran Univ, Halfeti Vocat Sch, Halfeti, Sanliurfa, Turkey
[3] Sakarya Univ Appl Sci, Fac Appl Sci, Dept Int Trade & Finance, Sakarya, Turkey
[4] Sakarya Univ Appl Sci, AI Res & Applicat Ctr, Sakarya, Turkey
Keywords
Air quality; Data preprocessing; Feature selection; Particulate matter (PM10); Long short-term memory (LSTM); AIR-POLLUTION; NEURAL-NETWORK; PM2.5; ARCHITECTURE; EXPOSURE; IMPACT; SO2
DOI
10.1016/j.envpol.2022.119973
Chinese Library Classification
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Studies have confirmed that PM10, defined as respirable particles with diameters of 10 µm and smaller, has adverse effects on human health and the environment. Various estimation methods use historical data to determine PM10 concentration in support of air pollution control, early warning, and the protection of public health and the environment. The present study analyses different Long Short-Term Memory (LSTM) models that can predict hourly PM10 concentration. In parallel, the study investigates the effect of the data preprocessing and feature selection (DPFS) process on the prediction accuracy of the LSTM models. For this purpose, three different LSTM models, namely Vanilla, Bi-Directional, and Stacked, were developed. A comprehensive data preprocessing stage was then used to eliminate missing data, erroneous data, and outliers from the real-world raw data, and a feature selection process was applied to remove unnecessary features. The LSTM models consider three air quality parameters (SO2, O3, and CO) and three meteorological factors (relative humidity, wind direction, and wind speed). The prediction performance of the LSTM models is compared using the RMSE, MAE, and R² performance indices, according to whether or not DPFS is used in the models. When the DPFS process was applied, the proposed LSTM models achieved high prediction performance and can be used to predict hourly PM10 concentrations. Overall, the DPFS process significantly enhanced the prediction performance of the developed LSTM models. Furthermore, the proposed model might be a useful tool for city administrators to make decisions and improve air quality management efforts.
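The three performance indices named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of RMSE, MAE, and the coefficient of determination; the abstract does not state the exact formulas, so treating R² as the coefficient of determination is an assumption. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def r2(observed, predicted):
    """Coefficient of determination (assumed meaning of R² here)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly PM10 values in µg/m³, invented for illustration only.
observed = [40.0, 50.0, 60.0, 70.0]
predicted = [42.0, 48.0, 63.0, 69.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"R2   = {r2(observed, predicted):.3f}")
```

Under these definitions, lower RMSE and MAE and an R² closer to 1 indicate better agreement, so the same functions could be applied to predictions made with and without DPFS to compare the two cases.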
Pages: 13
Related papers
50 entries in total
  • [41] Short Term Prediction of PM10 Concentrations Using Seasonal Time Series Analysis
    Hamid, Hazrul Abdul
    Yahaya, Ahmad Shukri
    Ramli, Nor Azam
    Ul-Saufie, Ahmad Zia
    Yasin, Mohd Norazam
    3RD INTERNATIONAL CONFERENCE ON CIVIL AND ENVIRONMENTAL ENGINEERING FOR SUSTAINABILITY (ICONCEES 2015), 2016, 47
  • [42] Parameter prediction of oilfield gathering station reservoir based on feature selection and long short-term memory network
    Tian, Wende
    Qu, Jian
    Liu, Bin
    Cui, Zhe
    Hu, Minggang
    MEASUREMENT, 2023, 206
  • [43] Prediction of surface roughness based on a hybrid feature selection method and long short-term memory network in grinding
    Weicheng Guo
    Chongjun Wu
    Zishan Ding
    Qinzhi Zhou
    The International Journal of Advanced Manufacturing Technology, 2021, 112 : 2853 - 2871
  • [44] Prediction of surface roughness based on a hybrid feature selection method and long short-term memory network in grinding
    Guo, Weicheng
    Wu, Chongjun
    Ding, Zishan
    Zhou, Qinzhi
    INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2021, 112 (9-10): : 2853 - 2871
  • [45] Air quality prediction based on Long Short-Term Memory Model with advanced feature selection and hyperparameter optimization
    Wu, Huiyong
    Yang, Tongtong
    Wu, Harris
    Li, Hongkun
    Zhou, Ziwei
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (04) : 5971 - 5985
  • [46] Forecasting Hourly Solar Irradiance Using Long Short-Term Memory (LSTM) Network
    Obiora, Chibuzor N.
    Ali, Ahmed
    Hasan, Ali N.
    2020 11TH INTERNATIONAL RENEWABLE ENERGY CONGRESS (IREC), 2020,
  • [47] Malware Classification using Long Short-term Memory Models
    Dang, Dennis
    Di Troia, Fabio
    Stamp, Mark
    ICISSP: PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS SECURITY AND PRIVACY, 2021, : 743 - 752
  • [48] Investigating Hourly Global Horizontal Irradiance Forecasting Using Long Short-Term Memory
    Yamani, Asma Z.
    Alyami, Sarah N.
    2021 IEEE ASIA-PACIFIC CONFERENCE ON COMPUTER SCIENCE AND DATA ENGINEERING (CSDE), 2021,
  • [49] Prediction of pedestrian trajectory based on long short-term memory of data
    Ono, Tomoya
    Kanamaru, Takashi
    2021 21ST INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2021), 2021, : 1676 - 1679
  • [50] Intrusion Detection using Deep Learning Long Short-term Memory with Wrapper Feature Selection Method
    Al Azwari, Sana
    Turabieh, Hamza
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (03) : 553 - 558