Feature selection for global tropospheric ozone prediction based on the BO-XGBoost-RFE algorithm

Cited by: 22
Authors
Zhang, Biao [1]
Zhang, Ying [2]
Jiang, Xuchu [2]
Affiliations
[1] Liaocheng Univ, Sch Comp Sci, Liaocheng 252000, Shandong, Peoples R China
[2] Zhongnan Univ Econ & Law, Sch Stat & Math, Wuhan 430073, Peoples R China
Keywords
AIR-QUALITY; CHEMISTRY
DOI
10.1038/s41598-022-13498-2
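Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract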
Ozone is one of the most important air pollutants, with significant impacts on human health, regional air quality and ecosystems. In this study, we use geographic and environmental information from monitoring sites in 5577 regions worldwide, covering 2010 to 2014, as input features to predict the long-term average ozone concentration at each site. We propose BO-XGBoost-RFE, an XGBoost-RFE feature selection model tuned by Bayesian optimization, and use several machine learning algorithms to predict ozone concentration from the optimal feature subset. Recursive feature elimination depends on the hyperparameters of its underlying model: different hyperparameter combinations lead the model to select different feature subsets, so the subset obtained under any single setting may not be optimal. We therefore use Bayesian optimization to tune the parameters of XGBoost-based recursive feature elimination, obtaining both the optimal parameter combination and the optimal feature subset under that combination. Experiments on global-scale, long-term ozone concentration prediction show that models using features selected by Bayesian-optimized XGBoost-RFE achieve higher prediction accuracy than models using all features or features selected by Pearson correlation. Among the four prediction models, random forest obtained the highest prediction accuracy, while the XGBoost prediction model showed the greatest improvement in accuracy.
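The idea in the abstract, tuning the hyperparameters of the model inside recursive feature elimination and keeping the feature subset found under the best setting, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Everything in it is an assumption: the data are synthetic, scikit-learn's `GradientBoostingRegressor` stands in for XGBoost (`xgboost.XGBRegressor` could be swapped in), the search space covers only `learning_rate` and `max_depth`, the subset size is fixed at 4, and the Bayesian optimizer is a small Gaussian-process loop with an expected-improvement acquisition. For brevity the features are selected on the full data before cross-validation, which makes the scores optimistic.

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import RFE
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (NOT the ozone dataset): 10 features, 4 informative.
X, y = make_regression(n_samples=200, n_features=10, n_informative=4,
                       noise=5.0, random_state=0)

# Assumed search space: learning_rate in [0.01, 0.3], max_depth in [2, 6].
BOUNDS = np.array([[0.01, 0.3], [2.0, 6.0]])

def to_params(u):
    """Map a point of the unit square to (learning_rate, max_depth)."""
    return BOUNDS[:, 0] + u * (BOUNDS[:, 1] - BOUNDS[:, 0])

def make_model(params):
    # Stand-in for XGBoost; max_depth is rounded to the nearest integer.
    return GradientBoostingRegressor(n_estimators=40,
                                     learning_rate=float(params[0]),
                                     max_depth=int(round(params[1])),
                                     random_state=0)

def rfe_score(params, n_keep=4):
    """Run RFE under the given hyperparameters; return (CV R^2, feature mask)."""
    selector = RFE(make_model(params), n_features_to_select=n_keep, step=1)
    selector.fit(X, y)
    score = cross_val_score(make_model(params), X[:, selector.support_], y,
                            cv=3, scoring="r2").mean()
    return float(score), selector.support_

def expected_improvement(candidates, gp, best):
    """Expected improvement over the best score so far (maximization)."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt_rfe(n_init=4, n_iter=6, seed=0):
    rng = np.random.default_rng(seed)
    tried = rng.uniform(size=(n_init, 2))            # random initial design
    results = [rfe_score(to_params(u)) for u in tried]
    for _ in range(n_iter):
        scores = np.array([r[0] for r in results])
        # Surrogate model of "CV score as a function of hyperparameters".
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True,
                                      random_state=seed).fit(tried, scores)
        candidates = rng.uniform(size=(256, 2))
        ei = expected_improvement(candidates, gp, scores.max())
        nxt = candidates[np.argmax(ei)]              # most promising candidate
        tried = np.vstack([tried, nxt])
        results.append(rfe_score(to_params(nxt)))
    best = int(np.argmax([r[0] for r in results]))
    return to_params(tried[best]), results[best][0], results[best][1]

best_params, best_score, best_mask = bayes_opt_rfe()
print("learning_rate, max_depth:", best_params)
print("CV R^2:", best_score)
print("selected feature indices:", np.flatnonzero(best_mask))
```

Each candidate hyperparameter setting triggers a full RFE run, so the optimizer compares feature subsets as well as model settings. In a real study the subset size would likely be searched too, and selection would be nested inside the cross-validation folds.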
Pages: 10