A Global K-modes Algorithm for Clustering Categorical Data

Authors
Bai Tian [1, 2]
Kulikowski, C. A. [2]
Gong Leiguang [3]
Yang Bin [1]
Huang Lan [1]
Zhou Chunguang [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Rutgers State Univ, Dept Comp Sci, New Brunswick, NJ 08903 USA
[3] IBM Thomas J Watson Res Ctr, Hawthorne, NJ USA
Source
CHINESE JOURNAL OF ELECTRONICS | 2012 / Vol. 21 / No. 03
Funding
National Natural Science Foundation of China;
Keywords
Categorical data; Clustering; Data mining; K-modes algorithm;
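DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract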
In this paper, a new Global k-modes (GKM) algorithm is proposed for clustering categorical data. The new method randomly selects a sufficiently large number of initial modes to account for the global distribution of the data set, and then progressively eliminates the redundant modes using an iterative optimization process with an elimination criterion function. Systematic experiments were carried out with data from the UCI Machine Learning Repository. The results and a comparative evaluation show that the proposed method performs well and consistently, achieving a significant improvement in clustering accuracy over other well-known k-modes-type algorithms.
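The over-initialize-then-eliminate idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the elimination criterion function, so the sketch swaps in a stand-in assumption: after each k-modes run, the mode with the fewest assigned records is dropped. The function names (`hamming`, `k_modes`, `global_k_modes_sketch`) and the parameters `n_init`, `max_iter`, and `seed` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter

def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))

def assign(data, modes):
    """Label each record with the index of its nearest mode (ties: lowest index)."""
    return [min(range(len(modes)), key=lambda j: hamming(row, modes[j]))
            for row in data]

def update_modes(data, labels, modes):
    """Recompute each mode as the per-attribute most frequent value of its cluster."""
    new_modes = []
    for j, old in enumerate(modes):
        members = [row for row, lab in zip(data, labels) if lab == j]
        if not members:
            new_modes.append(old)  # keep an empty cluster's mode unchanged
            continue
        new_modes.append(tuple(Counter(col).most_common(1)[0][0]
                               for col in zip(*members)))
    return new_modes

def k_modes(data, modes, max_iter=20):
    """Standard k-modes iterations starting from the given modes."""
    for _ in range(max_iter):
        labels = assign(data, modes)
        new_modes = update_modes(data, labels, modes)
        if new_modes == modes:
            break
        modes = new_modes
    return modes, assign(data, modes)

def global_k_modes_sketch(data, k, n_init, seed=0):
    """Start from n_init random modes and eliminate one mode per round until k remain.

    ASSUMPTION: the elimination rule used here (drop the mode with the
    fewest assigned records) is a placeholder, not the paper's criterion.
    """
    rng = random.Random(seed)
    distinct = sorted(set(data))
    modes = rng.sample(distinct, min(n_init, len(distinct)))
    while True:
        modes, labels = k_modes(data, modes)
        if len(modes) <= k:
            return modes, labels
        sizes = Counter(labels)
        weakest = min(range(len(modes)), key=lambda j: sizes.get(j, 0))
        modes = [m for j, m in enumerate(modes) if j != weakest]
```

Records are expected as tuples of categorical values, for example `global_k_modes_sketch(data, k=2, n_init=4)`. Starting with more modes than needed lets the initial set cover more of the data distribution, at the cost of extra k-modes rounds.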
Pages: 460 - 465
Number of pages: 6