A Genetic Programming approach for feature selection in highly dimensional skewed data

Cited by: 58
Authors
Viegas, Felipe [2]
Rocha, Leonardo [1]
Goncalves, Marcos [2]
Mourao, Fernando [1]
Sa, Giovanni [1]
Salles, Thiago [2]
Andrade, Guilherme [2]
Sandin, Isac [1]
Affiliations
[1] Univ Fed Sao Joao del Rei, Dept Comp Sci, Sao Joao Del Rei, MG, Brazil
[2] Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil
Keywords
Feature selection; Classification; Genetic Programming
DOI
10.1016/j.neucom.2017.08.050
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
High dimensionality, also known as the curse of dimensionality, is still a major challenge for automatic classification solutions. Accordingly, several feature selection (FS) strategies have been proposed for dimensionality reduction over the years. However, they may perform poorly in the face of unbalanced data. In this work, we propose a novel feature selection strategy based on Genetic Programming that is resilient to data skewness; in other words, it works well with both balanced and unbalanced data. The proposed strategy combines the most discriminative feature sets selected by distinct feature selection metrics in order to obtain a more effective and impartial set of the most discriminative features. It starts from the hypothesis that distinct feature selection metrics produce different (and potentially complementary) feature space projections. We evaluated our proposal on biological and textual datasets. Our experimental results show that the proposed solution not only increases the efficiency of the learning process, reducing the size of the data space by up to 83%, but also significantly increases its effectiveness in some scenarios. (C) 2017 Elsevier B.V. All rights reserved.
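The idea of combining feature sets chosen by distinct metrics can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two toy metrics (`class_rate_gap`, `label_agreement`), the set operators, the tuple-based tree encoding, and the tiny skewed dataset are all hypothetical. The sketch only shows how one candidate expression over per-metric feature sets could be represented and evaluated; it does not perform any evolutionary search or fitness evaluation.

```python
from typing import Callable, Dict, List, Set

Matrix = List[List[int]]  # rows of binary feature values (hypothetical toy data)


def class_rate_gap(X: Matrix, y: List[int], j: int) -> float:
    """Toy metric: |P(feature=1 | class 1) - P(feature=1 | class 0)|."""
    pos = [row[j] for row, label in zip(X, y) if label == 1]
    neg = [row[j] for row, label in zip(X, y) if label == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


def label_agreement(X: Matrix, y: List[int], j: int) -> float:
    """Toy metric: fraction of samples whose feature value equals the label."""
    return sum(1 for row, label in zip(X, y) if row[j] == label) / len(y)


def top_k(X: Matrix, y: List[int],
          metric: Callable[[Matrix, List[int], int], float], k: int) -> Set[int]:
    """Indices of the k highest-scoring features (ties broken by index)."""
    ranked = sorted(range(len(X[0])), key=lambda j: (-metric(X, y, j), j))
    return set(ranked[:k])


def evaluate(tree: tuple, leaves: Dict[str, Set[int]]) -> Set[int]:
    """Evaluate a nested-tuple expression whose leaves are per-metric feature sets."""
    op = tree[0]
    if op == "leaf":
        return leaves[tree[1]]
    left, right = evaluate(tree[1], leaves), evaluate(tree[2], leaves)
    if op == "union":
        return left | right
    if op == "intersection":
        return left & right
    if op == "difference":
        return left - right
    raise ValueError(f"unknown operator: {op}")


# Hypothetical skewed dataset: 2 positive samples, 4 negative samples.
X = [[1, 0, 1, 0], [1, 1, 1, 0], [0, 0, 1, 1],
     [0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 1, 1]]
y = [1, 1, 0, 0, 0, 0]

leaves = {
    "gap": top_k(X, y, class_rate_gap, k=2),
    "agree": top_k(X, y, label_agreement, k=2),
}

# One hand-written candidate; an evolutionary search would generate many such trees.
candidate = ("union", ("leaf", "gap"), ("leaf", "agree"))
selected = evaluate(candidate, leaves)
print(sorted(selected))
```

In this toy setting the two metrics disagree on the second-best feature, so the union retains features that either metric alone would have dropped, which is the kind of complementarity the abstract hypothesizes.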
Pages: 554 - 569
Number of pages: 16