A parallel feature selection method based on NMI-XGBoost and distance correlation for typhoon trajectory prediction

Cited by: 3
Authors
Qiao, Baiyou [1]
Wu, Jiaqi [1]
Wang, Rui [2]
Hao, Yuanqing [1]
Wang, Peirui [1]
Han, Donghong [1]
Wu, Gang [1]
Affiliations
[1] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110819, Peoples R China
[2] Chinese Acad Sci, Shenyang Inst Automat, Shenyang 110169, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2024 / Vol. 80 / Issue 08
Funding
National Natural Science Foundation of China;
Keywords
Feature selection; NMI; XGBoost; Distance correlation; Spark; ASSOCIATION; DEPENDENCE; MODEL;
DOI
10.1007/s11227-023-05863-3
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Typhoon trajectory data involve many factors, including atmospheric, oceanic, and physical factors. The data are high-dimensional and exhibit strong spatio-temporal and nonlinear correlations, which makes typhoon trajectory prediction more difficult. Feature selection, which picks appropriate prediction factors, is therefore an important means of reducing the dimensionality of typhoon trajectory data and improving the performance and accuracy of prediction methods. However, existing feature selection methods based on linear correlation analysis cannot adequately capture the nonlinear correlation between data features, which results in low feature selection accuracy, while methods based on nonlinear correlation analysis are computationally expensive, which affects the timeliness of feature selection. To address these problems, we propose NX-Spark-DC, a parallel feature selection method for typhoon trajectory data built on the Spark platform. The method first filters out redundant features using normalized mutual information (NMI), then eliminates useless features using an XGBoost machine learning model, thereby reducing the dimensionality of the data. On this basis, an improved Spark-based parallel distance correlation algorithm (Spark-DC) is proposed to select strongly correlated feature combinations. A series of experimental results show that NX-Spark-DC achieves high execution efficiency and accuracy and is significantly better than existing methods.
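The abstract's contrast between linear and nonlinear correlation measures can be illustrated with a small sketch. The code below is not the authors' Spark-DC algorithm; it is a serial, plain-Python version of the standard sample distance correlation for two one-dimensional variables, with Pearson correlation alongside for comparison. The function names `distance_correlation` and `pearson`, and the toy data, are assumptions made for illustration only.

```python
import math


def _centered_distances(values):
    """Pairwise |v_i - v_j| matrix with row, column, and grand means removed."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    a = _centered_distances(x)
    b = _centered_distances(y)
    n = len(x)
    n2 = n * n
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n2
    dvar_x = sum(v * v for row in a for v in row) / n2
    dvar_y = sum(v * v for row in b for v in row) / n2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0  # at least one variable is constant
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(dcov2, 0.0) / denom)


def pearson(x, y):
    """Pearson linear correlation coefficient, for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    var_x = sum((p - mx) ** 2 for p in x)
    var_y = sum((q - my) ** 2 for q in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical): y depends on x only through a symmetric nonlinear map.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v * v for v in x]
print("Pearson:", pearson(x, y))
print("Distance correlation:", distance_correlation(x, y))
```

On the toy data, Pearson correlation vanishes because the dependence is symmetric, while distance correlation stays clearly positive. The pairwise distance matrices cost O(n²) time and memory per feature pair, which is consistent with the abstract's remark that nonlinear measures are computationally expensive and with the motivation for a parallel implementation.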
Pages: 11293-11321
Page count: 29