Clustering Mixed Numeric and Categorical Data With Cuckoo Search

Cited by: 14
Authors
Ji, Jinchao [1 ,2 ,3 ,4 ]
Pang, Wei [5 ,6 ]
Li, Zairong [7 ]
He, Fei [1 ,2 ,3 ,4 ]
Feng, Guozhong [1 ,2 ,3 ]
Zhao, Xiaowei [1 ,2 ,3 ]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Jilin, Peoples R China
[2] Northeast Normal Univ, Inst Computat Biol, Changchun 130117, Jilin, Peoples R China
[3] Northeast Normal Univ, Key Lab Intelligent Informat Proc Jilin Univ, Changchun 130117, Jilin, Peoples R China
[4] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Jilin, Peoples R China
[5] Heriot Watt Univ, Sch Math & Comp Sci, Edinburgh EH14 4AS, Midlothian, Scotland
[6] Shaanxi Key Lab Complex Syst Control & Intelligen, Xian 710048, Shaanxi, Peoples R China
[7] Northeast Normal Univ, Sch Media Sci, Changchun 130117, Jilin, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Data clustering; cuckoo search; mixed data; numeric and categorical attributes; ALGORITHM;
DOI
10.1109/ACCESS.2020.2973216
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract
Clustering analysis, an important technique in data mining, aims to identify the natural groups or clusters of data objects in the attribute space. Data objects in real-world applications are commonly described by both numeric and categorical attributes. Partitional clustering algorithms designed for this type of mixed data are prone to getting trapped in local optima, whereas the cuckoo search approach is efficient at solving global optimization problems. Motivated by this, we propose CCS-K-Prototypes, a novel partitional Clustering algorithm based on Cuckoo Search and K-Prototypes, for clustering mixed numeric and categorical data. To deal with the different types of attributes, we develop a novel representation for candidate solutions and suggest two formulas that let the cuckoo search for potential solutions either around the existing solutions or in the entire attribute space. Finally, the performance of the proposed algorithm is assessed through a series of experiments on five benchmark datasets.
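For orientation, the sketch below shows the kind of objective a K-Prototypes-style method works with: squared Euclidean distance on numeric attributes plus a weighted count of mismatches on categorical attributes. This is the commonly used K-Prototypes dissimilarity, not necessarily the exact objective of CCS-K-Prototypes; the abstract does not state the paper's objective, its solution representation, or its two search formulas, so none of those are reproduced here. The function names, the weight `gamma`, and the toy data are all assumptions made for illustration.

```python
def mixed_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Squared Euclidean distance on numeric attributes plus
    gamma times the number of categorical mismatches.
    gamma is a hypothetical weight balancing the two parts."""
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical_part = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    return numeric_part + gamma * categorical_part


def clustering_cost(X_num, X_cat, protos_num, protos_cat, gamma=1.0):
    """Assign every object to its nearest prototype and return
    (labels, total cost). A metaheuristic such as cuckoo search
    could treat this total as the fitness of a candidate set of prototypes."""
    labels = []
    total = 0.0
    for x_num, x_cat in zip(X_num, X_cat):
        dists = [
            mixed_dissimilarity(x_num, x_cat, p_num, p_cat, gamma)
            for p_num, p_cat in zip(protos_num, protos_cat)
        ]
        best = min(range(len(dists)), key=dists.__getitem__)
        labels.append(best)
        total += dists[best]
    return labels, total


# Toy data (illustrative only): three objects, two numeric and two categorical attributes.
X_num = [[1.0, 2.0], [1.2, 1.8], [8.0, 9.0]]
X_cat = [["red", "s"], ["red", "m"], ["blue", "l"]]
protos_num = [[1.0, 2.0], [8.0, 9.0]]
protos_cat = [["red", "s"], ["blue", "l"]]

labels, total = clustering_cost(X_num, X_cat, protos_num, protos_cat, gamma=1.0)
print(labels, round(total, 2))
```

A cost of this form is what makes local search sensitive to the initial prototypes: different starting prototypes can converge to different assignments, which is the local-optimum problem the abstract refers to.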
Pages: 30988-31003
Page count: 16
Related papers
50 in total
  • [1] A CLUSTERING ALGORITHM FOR MIXED NUMERIC AND CATEGORICAL DATA
    Ohn Mar San
    Van-Nam Huynh
    Yoshiteru Nakamori
    Journal of Systems Science and Complexity, 2003, (04) : 562 - 571
  • [2] Algorithm for fuzzy clustering of mixed data with numeric and categorical attributes
    Ahmad, A
    Dey, L
    DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY, PROCEEDINGS, 2005, 3816 : 561 - 572
  • [3] A SURVEY ON CLUSTERING METHODS FOR NUMERIC, CATEGORICAL AND MIXED VARIABLES DATA
    Nisha
    Hooda, B. K.
    INTERNATIONAL JOURNAL OF AGRICULTURAL AND STATISTICAL SCIENCES, 2022, 18 (02) : 675 - 679
  • [4] Clustering mixed numeric and categorical data with artificial bee colony strategy
    Ji, Jinchao
    Chen, Yongbing
    Feng, Guozhong
    Zhao, Xiaowei
    He, Fei
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 36 (02) : 1521 - 1530
  • [5] A Multi-View Clustering Algorithm for Mixed Numeric and Categorical Data
    Ji, Jinchao
    Li, Ruonan
    Pang, Wei
    He, Fei
    Feng, Guozhong
    Zhao, Xiaowei
    IEEE ACCESS, 2021, 9 : 24913 - 24924
  • [6] Entropy based clustering of data streams with mixed numeric and categorical values
    Wang, Shuyun
    Fan, Yingjie
    Zhang, Chenghong
    Xu, HeXiang
    Hao, Xiulan
    Hu, Yunfa
    7TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE IN CONJUNCTION WITH 2ND IEEE/ACIS INTERNATIONAL WORKSHOP ON E-ACTIVITY, PROCEEDINGS, 2008, : 140 - +
  • [7] Clustering algorithm for incomplete data sets with mixed numeric and categorical attributes
    Sen, Wu
    Hong, Chen
    Xiaodong, Feng
    International Journal of Database Theory and Application, 2013, 6 (05) : 95 - 104
  • [8] A k-mean clustering algorithm for mixed numeric and categorical data
    Ahmad, Amir
    Dey, Lipika
    DATA & KNOWLEDGE ENGINEERING, 2007, 63 (02) : 503 - 527
  • [9] An improved k-prototypes clustering algorithm for mixed numeric and categorical data
    Ji, Jinchao
    Bai, Tian
    Zhou, Chunguang
    Ma, Chao
    Wang, Zhe
    NEUROCOMPUTING, 2013, 120 : 590 - 596
  • [10] Optimization of the Numeric and Categorical Attribute Weights in KAMILA Mixed Data Clustering Algorithm
    Martarelli, Nadia Junqueira
    Nagano, Marcelo Seido
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2019, PT I, 2019, 11871 : 20 - 27