A new unsupervised feature selection algorithm using similarity-based feature clustering

Cited by: 34
Authors
Zhu, Xiaoyan [1 ]
Wang, Yu [1 ]
Li, Yingbin [1 ]
Tan, Yonghui [1 ]
Wang, Guangtao [2 ]
Song, Qinbao [1 ]
Institutions
[1] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Xian, Shaanxi, Peoples R China
[2] JD AI Res, Mountain View, CA USA
Funding
National Natural Science Foundation of China
Keywords
clustering; feature selection; feature similarity; CLASSIFICATION;
DOI
10.1111/coin.12192
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Unsupervised feature selection is an important problem, especially for high-dimensional data. However, it has so far been only scarcely studied, and the existing algorithms do not provide satisfactory performance. In this paper, we therefore propose a new unsupervised feature selection algorithm using similarity-based feature clustering, Feature Selection-based Feature Clustering (FSFC). FSFC removes redundant features according to the results of feature clustering based on feature similarity. First, it clusters the features according to their similarity, using a newly proposed feature clustering algorithm that overcomes the shortcomings of K-means. Second, it selects a representative feature from each cluster, the one that carries most of the information of the features in that cluster. The efficiency and effectiveness of FSFC are evaluated on real-world data sets and compared with two representative unsupervised feature selection algorithms, Feature Selection Using Similarity (FSUS) and Multi-Cluster-based Feature Selection (MCFS), in terms of runtime, feature compression ratio, and the clustering results of K-means. The results show that FSFC not only reduces the feature space in less time but also significantly improves the clustering performance of K-means.
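The two-step scheme in the abstract (cluster features by similarity, then keep one representative per cluster) can be illustrated with a minimal sketch. This is not the authors' FSFC algorithm: the similarity measure (absolute Pearson correlation), the greedy threshold-based clustering, and the representative criterion (highest mean within-cluster similarity) are all illustrative assumptions.

```python
import numpy as np


def similarity_matrix(X):
    """Absolute Pearson correlation between feature columns as similarity.

    Assumption: the paper's actual similarity measure may differ."""
    return np.abs(np.corrcoef(X, rowvar=False))


def cluster_features(S, threshold=0.8):
    """Greedy clustering: join a feature to an existing cluster if its
    similarity to that cluster's seed feature exceeds the threshold,
    else start a new cluster.  Unlike K-means, the number of clusters
    need not be fixed in advance."""
    clusters = []  # each cluster is a list of feature indices
    seeds = []     # first (seed) feature of each cluster
    for f in range(S.shape[0]):
        for c, seed in enumerate(seeds):
            if S[f, seed] >= threshold:
                clusters[c].append(f)
                break
        else:
            seeds.append(f)
            clusters.append([f])
    return clusters


def select_representatives(S, clusters):
    """From each cluster keep the feature with the highest mean similarity
    to the cluster's members (a crude proxy for 'most informative')."""
    reps = []
    for members in clusters:
        sub = S[np.ix_(members, members)]
        reps.append(members[int(np.argmax(sub.mean(axis=1)))])
    return sorted(reps)


# Toy data: features 0 and 1 are nearly identical (redundant),
# feature 2 is independent noise.
rng = np.random.default_rng(0)
f0 = rng.normal(size=100)
X = np.column_stack([f0,
                     f0 + 0.01 * rng.normal(size=100),
                     rng.normal(size=100)])
S = similarity_matrix(X)
clusters = cluster_features(S, threshold=0.8)
reps = select_representatives(S, clusters)
print(clusters, reps)  # redundant pair collapses into one cluster
```

On the toy data, the redundant pair {0, 1} forms one cluster and the independent feature its own, so the selected subset drops one of the two redundant features, which is the redundancy-removal effect the abstract describes.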
Pages: 2-22
Number of pages: 21
Related Papers
(50 items in total)
  • [41] A unifying criterion for unsupervised clustering and feature selection
    Breaban, Mihaela
    Luchian, Henri
    PATTERN RECOGNITION, 2011, 44 (04) : 854 - 865
  • [42] Unsupervised Feature Selection with Joint Clustering Analysis
    An, Shuai
    Wang, Jun
    Wei, Jinmao
    Yang, Zhenglu
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 1639 - 1648
  • [43] Unsupervised feature selection via discrete spectral clustering and feature weights
    Shang, Ronghua
    Kong, Jiarui
    Wang, Lujuan
    Zhang, Weitong
    Wang, Chao
    Li, Yangyang
    Jiao, Licheng
    NEUROCOMPUTING, 2023, 517 : 106 - 117
  • [44] Integration of dense subgraph finding with feature clustering for unsupervised feature selection
    Bandyopadhyay, Sanghamitra
    Bhadra, Tapas
    Mitra, Pabitra
    Maulik, Ujjwal
    PATTERN RECOGNITION LETTERS, 2014, 40 : 104 - 112
  • [45] An Evolutionary Attribute Clustering and Selection Method Based on Feature Similarity
    Hong, Tzung-Pei
    Wang, Po-Cheng
    Ting, Chuan-Kang
    2010 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2010,
  • [46] Unsupervised authorship attribution using feature selection and weighted cosine similarity
    Martin-del-Campo-Rodriguez, Carolina
    Sidorov, Grigori
    Batyrshin, Ildar
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 42 (05) : 4357 - 4367
  • [47] A new feature matching algorithm for image registration based on feature similarity
    Lv, Jin-jian
    Wen, Gong-jian
    Wang, Ji-yang
    CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 4, PROCEEDINGS, 2008, : 421 - 425
  • [48] A Novel Crowding Clustering Algorithm for Unsupervised and Supervised Filter Feature Selection Problem
    Ghanem, Khadoudja
    Layeb, Abdesslem
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2024,
  • [49] Balanced Spectral Clustering Algorithm Based on Feature Selection
    Luo, Qimin
    Lu, Guangquan
    Wen, Guoqiu
    Su, Zidong
    Liu, Xingyi
    Wei, Jian
    ADVANCED DATA MINING AND APPLICATIONS, ADMA 2021, PT II, 2022, 13088 : 356 - 367
  • [50] A novel feature selection approach based on clustering algorithm
    Moslehi, Fateme
    Haeri, Abdorrahman
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2021, 91 (03) : 581 - 604