Feature subset selection for data and feature streams: a review

Authors
Carlos Villa-Blanco
Concha Bielza
Pedro Larrañaga
Affiliation
[1] Universidad Politécnica de Madrid, Computational Intelligence Group, Departamento de Inteligencia Artificial
Keywords
Data streams; Feature streams; Dynamic environments; Feature subset selection; Supervised classification; Clustering
Abstract
Real-world problems are commonly characterized by a high feature dimensionality, which hinders the modelling and descriptive analysis of the data. However, some of these data may be irrelevant or redundant for the learning process. Different approaches can be used to reduce this information, improving not only the speed of building models but also their performance and interpretability. In this review, we focus on feature subset selection (FSS) techniques, which select a subset of the original feature set without making any transformation on the attributes. Traditional batch FSS algorithms may not be adequate to efficiently handle large volumes of data, either because memory problems arise or data are received in a sequential manner. Thus, this article aims to survey the state of the art of incremental FSS algorithms, which can perform more efficiently under these circumstances. Different strategies are described, such as incrementally updating feature weights, applying information theory or using rough set-based FSS, as well as multiple supervised and unsupervised learning tasks where the application of FSS is interesting.
Pages: 1011-1062
Number of pages: 51