A tutorial-based survey on feature selection: Recent advancements on feature selection

Cited by: 21
Author
Moslemi, Amir [1]
Affiliation
[1] Sunnybrook Hlth Sci Ctr, Imaging Res & Phys Sci, Toronto, ON M4N 3M5, Canada
Keywords
Feature selection; Matrix factorization; Sparse representation learning; Information theory; Evolutionary computation; Reinforcement learning; UNSUPERVISED FEATURE-SELECTION; SUPERVISED FEATURE-SELECTION; NONNEGATIVE MATRIX FACTORIZATION; PARTICLE SWARM OPTIMIZATION; EFFICIENT FEATURE-SELECTION; SPARSE FEATURE-SELECTION; LABEL FEATURE-SELECTION; HESITANT FUZZY-SETS; GENETIC ALGORITHM; MUTUAL INFORMATION;
DOI
10.1016/j.engappai.2023.107136
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The curse of dimensionality has been a major challenge in data mining, pattern recognition, computer vision, and machine learning in recent years. Feature selection and feature extraction are the two main approaches to circumventing this challenge. The main objective of feature selection is to remove redundant features and preserve relevant ones in order to improve the performance of the learning algorithm. This survey provides a comprehensive overview of state-of-the-art feature selection techniques, including mathematical formulas and fundamental algorithms to facilitate understanding. It covers approaches to feature selection that can be categorized into five domains: A) subspace learning, which involves matrix factorization and matrix projection; B) sparse representation learning, which includes compressed sensing and dictionary learning; C) information theory, which covers multi-label neighborhood entropy, symmetrical uncertainty, Monte Carlo, and Markov blanket; D) evolutionary computation algorithms, including the genetic algorithm (GA), particle swarm optimization (PSO), ant colony (AC), and grey wolf optimization (GWO); and E) reinforcement learning techniques. This survey can help researchers acquire a deep understanding of feature selection techniques and choose an appropriate one. Moreover, researchers can choose one of the domains A to E to study in depth in future work. A potential avenue for future research is to explore methods that reduce computational complexity while maintaining performance, that is, to achieve a better balance between computational resources and overall performance. For matrix-based techniques, the main limitation is the need to tune the coefficients of the regularization terms, which can be challenging and time-consuming. For evolutionary computation techniques, the two main limitations are getting stuck in local minima and finding an appropriate objective function.
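The abstract names symmetrical uncertainty among the information-theoretic criteria. As a minimal sketch, not taken from the survey itself, the following shows how such a criterion could rank discrete features against class labels, using the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The function names (`entropy`, `symmetrical_uncertainty`, `rank_features`) and the toy data are assumptions made up for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); 0 means independent, 1 means fully dependent."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant; SU is taken as 0 here (an assumption)
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y) = H(X) + H(Y) - H(X, Y)
    return 2.0 * mutual_info / (h_x + h_y)


def rank_features(columns, labels):
    """Return (feature names sorted by decreasing SU with the labels, score dict)."""
    scores = {name: symmetrical_uncertainty(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical toy data: three binary features and a binary class label.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
columns = {
    "independent": [0, 1, 0, 1, 0, 1, 0, 1],    # carries no information about the label
    "noisy": [0, 0, 1, 1, 0, 1, 1, 0],          # agrees with the label on 6 of 8 samples
    "copy_of_label": [0, 0, 1, 1, 0, 0, 1, 1],  # identical to the label
}

ranking, scores = rank_features(columns, labels)
for name in ranking:
    print(f"{name}: SU = {scores[name]:.3f}")
```

A filter of this kind scores each feature on its own, so it captures relevance to the label but not redundancy among features; handling redundancy would need an additional step, such as comparing SU between pairs of features.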
Pages: 28