A Genetic Programming approach for feature selection in highly dimensional skewed data

Cited by: 58
Authors
Viegas, Felipe [2]
Rocha, Leonardo [1]
Goncalves, Marcos [2]
Mourao, Fernando [1]
Sa, Giovanni [1]
Salles, Thiago [2]
Andrade, Guilherme [2]
Sandin, Isac [1]
Affiliations
[1] Univ Fed Sao Joao del Rei, Dept Comp Sci, Sao Joao Del Rei, MG, Brazil
[2] Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil
Keywords
Feature selection; Classification; Genetic Programming
DOI
10.1016/j.neucom.2017.08.050
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
High dimensionality, also known as the curse of dimensionality, is still a major challenge for automatic classification solutions. Accordingly, several feature selection (FS) strategies have been proposed for dimensionality reduction over the years. However, they can perform poorly in the face of unbalanced data. In this work, we propose a novel feature selection strategy based on Genetic Programming that is resilient to data skewness; in other words, it works well with both balanced and unbalanced data. The proposed strategy combines the most discriminative feature sets selected by distinct feature selection metrics in order to obtain a more effective and impartial set of the most discriminative features. It starts from the hypothesis that distinct feature selection metrics produce different (and potentially complementary) feature space projections. We evaluated our proposal on biological and textual datasets. Our experimental results show that the proposed solution not only increases the efficiency of the learning process, reducing the size of the data space by up to 83%, but also significantly increases its effectiveness in some scenarios. © 2017 Elsevier B.V. All rights reserved.
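The idea of combining feature sets chosen by different metrics can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the function set, terminals, or fitness function, so the set operations (`union`, `intersection`, `difference`), the helper names `top_k` and `evaluate`, and the metric scores below are all assumptions made for illustration. The evolutionary search itself (population, crossover, mutation, fitness) is omitted; the sketch only shows how one candidate expression tree over metric-specific feature sets could be evaluated.

```python
def top_k(scores, k):
    """Return the indices of the k highest-scoring features as a set."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])

# Hypothetical per-feature scores from three different selection metrics.
metric_a = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2]
metric_b = [0.2, 0.85, 0.75, 0.1, 0.9, 0.3]
metric_c = [0.6, 0.2, 0.1, 0.95, 0.3, 0.8]

set_a = top_k(metric_a, 3)
set_b = top_k(metric_b, 3)
set_c = top_k(metric_c, 3)

# Assumed function set: binary set operations used as internal tree nodes.
OPS = {
    "union": set.union,
    "intersection": set.intersection,
    "difference": set.difference,
}

def evaluate(tree):
    """Evaluate a tree whose leaves are sets and whose nodes are (op, left, right)."""
    if isinstance(tree, (set, frozenset)):
        return set(tree)
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))

# One candidate individual: features both A and B agree on,
# plus features that C selects but A does not.
candidate = (
    "union",
    ("intersection", set_a, set_b),
    ("difference", set_c, set_a),
)
selected = evaluate(candidate)
print(sorted(selected))
```

In a full Genetic Programming loop, many such trees would be generated and scored, for example by the accuracy of a classifier trained on the selected features; that scoring criterion is likewise an assumption here.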
Pages: 554-569
Number of pages: 16
Related papers
50 in total
  • [1] Genetic Programming for Feature Selection and Construction to High-Dimensional Data
    Ma, Jianbin
    Zhu, Man
    2024 4TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND INTELLIGENT SYSTEMS ENGINEERING, MLISE 2024, 2024, : 196 - 200
  • [2] A Genetic Programming Approach Applied to Feature Selection from Medical Data
    Castellanos-Garzon, Jose A.
    Ramos, Juan
    Mezquita Martin, Yeray
    de Paz, Juan F.
    Costa, Ernesto
    PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2019, 803 : 200 - 207
  • [3] Genetic programming for feature construction and selection in classification on high-dimensional data
    Binh Tran
    Bing Xue
    Mengjie Zhang
    Memetic Computing, 2016, 8 : 3 - 15
  • [4] Genetic programming for feature construction and selection in classification on high-dimensional data
    Binh Tran
    Xue, Bing
    Zhang, Mengjie
    MEMETIC COMPUTING, 2016, 8 (01) : 3 - 15
  • [5] Genetic Programming as a Feature Selection Algorithm
    Suarez, Ranyart R.
    Maria Valencia-Ramirez, Jose
    Graff, Mario
    2014 IEEE INTERNATIONAL AUTUMN MEETING ON POWER, ELECTRONICS AND COMPUTING (ROPEC), 2014,
  • [6] A genetic programming approach to feature selection and classification of instantaneous cognitive states
    Ramirez, Rafael
    Puiggros, Montserrat
    APPLICATIONS OF EVOLUTIONARY COMPUTING, PROCEEDINGS, 2007, 4448 : 311 - +
  • [7] Genetic programming with a genetic algorithm for feature construction and selection
    Smith M.G.
    Bull L.
    Genetic Programming and Evolvable Machines, 2005, 6 (3) : 265 - 281
  • [8] A filter-based feature construction and feature selection approach for classification using Genetic Programming
    Ma, Jianbin
    Gao, Xiaoying
    KNOWLEDGE-BASED SYSTEMS, 2020, 196
  • [9] Comparison of Feature Selection Methods in Text Classification on Highly Skewed Datasets
    Asim, Muhammad Nabeel
    Wasim, Muhammad
    Ali, Muhammad Sajid
    Rehman, Abdur
    2017 FIRST INTERNATIONAL CONFERENCE ON LATEST TRENDS IN ELECTRICAL ENGINEERING AND COMPUTING TECHNOLOGIES (INTELLECT), 2017,
  • [10] A new representation in genetic programming with hybrid feature ranking criterion for high-dimensional feature selection
    Li, Jiayi
    Zhang, Fan
    Ma, Jianbin
    COMPLEX & INTELLIGENT SYSTEMS, 2025, 11 (04)