Clustering-Guided Particle Swarm Feature Selection Algorithm for High-Dimensional Imbalanced Data With Missing Values

Cited by: 59
Authors
Zhang, Yong [1]
Wang, Yan-Hu [1]
Gong, Dun-Wei [1]
Sun, Xiao-Yan [1]
Affiliations
[1] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Class imbalance; feature selection (FS); fuzzy clustering; missing value; particle swarm optimization (PSO); SENSITIVE FEATURE-SELECTION; MUTUAL INFORMATION; DIFFERENTIAL EVOLUTION; GENETIC ALGORITHM; OPTIMIZATION; CLASSIFICATION; MACHINE;
DOI
10.1109/TEVC.2021.3106975
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection (FS) on data with class imbalance or missing values has received much attention from researchers because both characteristics are common in real-world applications. However, FS algorithms designed for data that exhibit both characteristics at once are still lacking. Because missing data and class imbalance are coupled in complex ways, a better FS method is needed. To tackle high-dimensional imbalanced data with missing values, this article studies a new evolutionary FS method. First, an improved F-measure based on filling risk (RF-measure) is defined to evaluate the influence of missing data on FS performance under class imbalance. Then, taking the RF-measure as the objective function, a particle swarm optimization-based FS method with fuzzy clustering (PSOFS-FC) is proposed. Two new problem-specific strategies, namely a swarm initialization strategy guided by fuzzy clustering and a local pruning operator based on feature importance, are developed to improve the performance of PSOFS-FC. In comparisons with state-of-the-art FS algorithms on several public datasets, experimental results show that PSOFS-FC achieves excellent classification performance with relatively little running time, indicating its superiority in tackling high-dimensional imbalanced data with missing values.
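The abstract builds its objective on the F-measure. The sketch below illustrates why that metric is often preferred to accuracy under class imbalance. It implements only the standard F-measure; the filling-risk term of the RF-measure is not defined in this record and is not reproduced here. The function names and toy labels are assumptions made for illustration.

```python
def f_measure(y_true, y_pred, positive=1, beta=1.0):
    """Standard F-measure with `positive` treated as the minority class.

    This is NOT the RF-measure from the abstract; it omits any
    filling-risk adjustment for imputed missing values.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy data: 2 minority samples (label 1), 8 majority samples (label 0).
y_true = [1, 1] + [0] * 8
majority_only = [0] * 10          # classifier that ignores the minority class
one_hit = [1, 0] + [0] * 8        # classifier that finds one minority sample

acc_majority = accuracy(y_true, majority_only)
f_majority = f_measure(y_true, majority_only)
f_one_hit = f_measure(y_true, one_hit)
```

On this toy data a classifier that never predicts the minority class still scores high accuracy while its F-measure is zero, which is the behaviour an imbalance-aware objective needs to penalize.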
Pages: 616 - 630
Number of pages: 15
Related papers
50 in total
  • [21] Clustering of imbalanced high-dimensional media data
    Brodinova, Sarka
    Zaharieva, Maia
    Filzmoser, Peter
    Ortner, Thomas
    Breiteneder, Christian
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2018, 12 (02) : 261 - 284
  • [22] Clustering of imbalanced high-dimensional media data
    Šárka Brodinová
    Maia Zaharieva
    Peter Filzmoser
    Thomas Ortner
    Christian Breiteneder
    Advances in Data Analysis and Classification, 2018, 12 : 261 - 284
  • [23] Surrogate Sample-Assisted Particle Swarm Optimization for Feature Selection on High-Dimensional Data
    Song, Xianfang
    Zhang, Yong
    Gong, Dunwei
    Liu, Hui
    Zhang, Wanqiu
    IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2023, 27 (03) : 595 - 609
  • [24] On online high-dimensional spherical data clustering and feature selection
    Amayri, Ola
    Bouguila, Nizar
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (04) : 1386 - 1398
  • [25] The classification method based on evolutionary algorithm for high-dimensional imbalanced missing data
    Liu, Yi
    Li, Gengsong
    Li, Xiang
    Qin, Wei
    Zheng, Qibin
    Ren, Xiaoguang
    ELECTRONICS LETTERS, 2023, 59 (12)
  • [26] Particle swarm optimizer for variable weighting in clustering high-dimensional data
    Lu, Yanping
    Wang, Shengrui
    Li, Shaozi
    Zhou, Changle
    MACHINE LEARNING, 2011, 82 (01) : 43 - 70
  • [27] Particle swarm optimizer for variable weighting in clustering high-dimensional data
    Yanping Lu
    Shengrui Wang
    Shaozi Li
    Changle Zhou
    Machine Learning, 2011, 82 : 43 - 70
  • [28] Particle Swarm Optimizer for Variable Weighting in Clustering High-dimensional Data
    Lu, Yanping
    Wang, Shengrui
    Li, Shaozi
    Zhou, Changle
    2009 IEEE SWARM INTELLIGENCE SYMPOSIUM, 2009, : 37 - +
  • [29] A Clustering Algorithm for High-Dimensional Nonlinear Feature Data with Applications
    Jiang H.
    Wang G.
    Gao J.
    Gao Z.
    Gao R.
    Guo Q.
    Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University, 2017, 51 (12) : 49 - 55 and 90
  • [30] Guided Particle Adaptation PSO for Feature Selection on High-dimensional Classification
    Huang, Mingshen
    Yuan, Weiwei
    Guan, Donghai
    Lu, Mengze
    Koc, Cetin Kaya
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT I, ICIC 2024, 2024, 14862 : 14 - 26