A GA-based Feature Selection for High-dimensional Data Clustering

Cited by: 5
Authors
Sun, Mei [1]
Xiong, Langhuan [1]
Sun, Haojun [1]
Jiang, Dazhi [1]
Affiliations
[1] Shantou Univ, Dept Comp Sci & Technol, Shantou 515063, Peoples R China
Keywords
feature selection; clustering; genetic algorithms; high-dimensional data
DOI
10.1109/WGEC.2009.140
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
High-dimensional data clustering is an open problem in modern data mining. This paper proposes a new genetic algorithm-based feature selection method for high-dimensional data clustering, called GA-FSFclustering. The approach uses a genetic algorithm to search the full feature set for feature subsets that are effective for clustering. The candidate features and cluster centers are encoded as real numbers. A new criterion for evaluating feature subsets serves as the fitness function. Experimental results indicate that the GA-FSFclustering algorithm is feasible and efficient.
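The abstract gives only the outline of the method, so the following is a minimal sketch of the general idea rather than the authors' GA-FSFclustering algorithm. It assumes a real-coded chromosome whose first genes are per-feature scores (a feature counts as selected when its score exceeds 0.5) and whose remaining genes are cluster-center coordinates, with data scaled to [0, 1]. The paper's fitness criterion is not stated in the abstract, so a stand-in is used here: the negative mean within-cluster squared distance over the selected features. All names (`decode`, `fitness`, `evolve`) and all parameter values are hypothetical.

```python
import random


def decode(chrom, n_features, n_clusters):
    """Split a real-coded chromosome into a feature mask and cluster centers."""
    # Assumption: a feature is selected when its gene exceeds 0.5.
    mask = [g > 0.5 for g in chrom[:n_features]]
    flat = chrom[n_features:]
    centers = [flat[c * n_features:(c + 1) * n_features]
               for c in range(n_clusters)]
    return mask, centers


def fitness(chrom, data, n_features, n_clusters):
    """Stand-in criterion (not the paper's): negative mean squared distance
    to the nearest center, measured on the selected features only."""
    mask, centers = decode(chrom, n_features, n_clusters)
    selected = [j for j, keep in enumerate(mask) if keep]
    if not selected:
        return float("-inf")  # an empty feature subset is never acceptable
    total = 0.0
    for x in data:
        total += min(sum((x[j] - c[j]) ** 2 for j in selected)
                     for c in centers)
    return -total / (len(data) * len(selected))


def evolve(data, n_clusters, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    """Simple elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    n_features = len(data[0])
    length = n_features * (1 + n_clusters)
    pop = [[rng.random() for _ in range(length)] for _ in range(pop_size)]

    def score(ch):
        return fitness(ch, data, n_features, n_clusters)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clipped so genes stay in [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = parents + children

    best = max(pop, key=score)
    return decode(best, n_features, n_clusters), score(best)
```

Calling `evolve(data, n_clusters=2)` on a list of equal-length rows returns the selected-feature mask and cluster centers of the best chromosome together with its fitness. Because this stand-in criterion averages over the selected features, it tends to favor small subsets with tight clusters; a real criterion would need to balance subset size against cluster quality.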
Pages: 769-772
Number of pages: 4
Related papers
50 in total
  • [21] Feature Selection with High-Dimensional Imbalanced Data
    Van Hulse, Jason
    Khoshgoftaar, Taghi M.
    Napolitano, Amri
    Wald, Randall
    2009 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2009), 2009, : 507 - 514
  • [22] Feature selection for high-dimensional temporal data
    Tsagris, Michail
    Lagani, Vincenzo
    Tsamardinos, Ioannis
    BMC BIOINFORMATICS, 2018, 19
  • [23] FEATURE SELECTION FOR HIGH-DIMENSIONAL DATA ANALYSIS
    Verleysen, Michel
    ECTA 2011/FCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON EVOLUTIONARY COMPUTATION THEORY AND APPLICATIONS AND INTERNATIONAL CONFERENCE ON FUZZY COMPUTATION THEORY AND APPLICATIONS, 2011
  • [24] Feature Selection for Clustering on High Dimensional Data
    Zeng, Hong
    Cheung, Yiu-ming
    PRICAI 2008: TRENDS IN ARTIFICIAL INTELLIGENCE, 2008, 5351 : 913 - 922
  • [25] Using Feature Clustering for GP-Based Feature Construction on High-Dimensional Data
    Binh Tran
    Xue, Bing
    Zhang, Mengjie
    GENETIC PROGRAMMING, EUROGP 2017, 2017, 10196 : 210 - 226
  • [26] High-dimensional data clustering using k-means subspace feature selection
    Wang, Xiao-Dong
    Chen, Rung-Ching
    Yan, Fei
    Journal of Network Intelligence, 2019, 4 (03): : 80 - 87
  • [27] A GA-Based Wrapper Feature Selection for Animal Breeding Data Mining
    Unold, Olgierd
    Dobrowolski, Maciej
    Maciejewski, Henryk
    Skrobanek, Pawel
    Walkowicz, Ewa
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, PT II, 2012, 7209 : 200 - 209
  • [28] Bayesian variable selection in clustering high-dimensional data
    Tadesse, MG
    Sha, N
    Vannucci, M
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2005, 100 (470) : 602 - 617
  • [29] Feature selection for modular GA-based classification
    Zhu, FM
    Guan, S
    APPLIED SOFT COMPUTING, 2004, 4 (04) : 381 - 393
  • [30] Neighborhood Component Feature Selection for High-Dimensional Data
    Yang, Wei
    Wang, Kuanquan
    Zuo, Wangmeng
    JOURNAL OF COMPUTERS, 2012, 7 (01) : 161 - 168