Clustering Mixed Numeric and Categorical Data With Cuckoo Search

Cited by: 14
Authors
Ji, Jinchao [1 ,2 ,3 ,4 ]
Pang, Wei [5 ,6 ]
Li, Zairong [7 ]
He, Fei [1 ,2 ,3 ,4 ]
Feng, Guozhong [1 ,2 ,3 ]
Zhao, Xiaowei [1 ,2 ,3 ]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Jilin, Peoples R China
[2] Northeast Normal Univ, Inst Computat Biol, Changchun 130117, Jilin, Peoples R China
[3] Northeast Normal Univ, Key Lab Intelligent Informat Proc Jilin Univ, Changchun 130117, Jilin, Peoples R China
[4] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Jilin, Peoples R China
[5] Heriot Watt Univ, Sch Math & Comp Sci, Edinburgh EH14 4AS, Midlothian, Scotland
[6] Shaanxi Key Lab Complex Syst Control & Intelligen, Xian 710048, Shaanxi, Peoples R China
[7] Northeast Normal Univ, Sch Media Sci, Changchun 130117, Jilin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data clustering; cuckoo search; mixed data; numeric and categorical attributes; ALGORITHM;
DOI
10.1109/ACCESS.2020.2973216
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cluster analysis, an important technique in data mining, aims to identify the natural groups, or clusters, of data objects in the attribute space. Data objects in real-world applications are commonly described by both numeric and categorical attributes. Partitional clustering algorithms designed for such mixed data are prone to becoming trapped in local optima, whereas cuckoo search is efficient at solving global optimization problems. We therefore propose CCS-K-Prototypes, a novel partitional Clustering algorithm based on Cuckoo Search and K-Prototypes, for clustering mixed numeric and categorical data. To handle the different attribute types, we develop a novel representation for candidate solutions and propose two formulas that let the cuckoo search for potential solutions either around existing solutions or across the entire attribute space. Finally, the performance of the proposed algorithm is assessed in a series of experiments on five benchmark datasets.
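The abstract does not reproduce the solution representation or the two search formulas, so they are not sketched here. As a hedged illustration only, the following shows the standard k-prototypes style dissimilarity for mixed data (squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches) and the total cost that a global optimizer such as cuckoo search could minimize over candidate prototype sets. The function names, the weight `gamma`, and the tuple layout of objects are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout for this sketch: (numeric attributes, categorical attributes).
MixedObject = Tuple[Sequence[float], Sequence[str]]


def mixed_dissimilarity(x: MixedObject, proto: MixedObject, gamma: float = 1.0) -> float:
    """Squared Euclidean distance on numeric parts plus gamma * categorical mismatches."""
    x_num, x_cat = x
    p_num, p_cat = proto
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, p_num))
    categorical_part = sum(1 for a, b in zip(x_cat, p_cat) if a != b)
    return numeric_part + gamma * categorical_part


def assign(objects: List[MixedObject], prototypes: List[MixedObject],
           gamma: float = 1.0) -> List[int]:
    """Return, for each object, the index of its nearest prototype."""
    return [
        min(range(len(prototypes)),
            key=lambda k: mixed_dissimilarity(obj, prototypes[k], gamma))
        for obj in objects
    ]


def clustering_cost(objects: List[MixedObject], prototypes: List[MixedObject],
                    gamma: float = 1.0) -> float:
    """Total cost of a candidate solution: each object charged to its nearest prototype."""
    return sum(
        min(mixed_dissimilarity(obj, p, gamma) for p in prototypes)
        for obj in objects
    )


if __name__ == "__main__":
    data = [((1.0,), ("a",)), ((1.2,), ("a",)), ((5.0,), ("b",))]
    candidate = [((1.0,), ("a",)), ((5.0,), ("b",))]
    print(assign(data, candidate))
    print(clustering_cost(data, candidate))
```

In this framing, a candidate solution is simply a list of prototypes, and a population-based search would compare candidates by `clustering_cost`; how CCS-K-Prototypes actually encodes and perturbs candidates is described in the full paper, not in this record.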
Pages: 30988-31003
Page count: 16
Related papers
50 in total
  • [31] A hybrid decision tree algorithm for mixed numeric and categorical data in regression analysis
    Kim, Kyoungok
    Hong, Jung-Sik
    PATTERN RECOGNITION LETTERS, 2017, 98 : 39 - 45
  • [32] A fuzzy c-means-type algorithm for clustering of data with mixed numeric and categorical attributes employing a probabilistic dissimilarity functional
    Chatzis, Sotirios P.
    EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (07) : 8684 - 8689
  • [33] Clustering categorical data sets using tabu search techniques
    Ng, MK
    Wong, JC
    PATTERN RECOGNITION, 2002, 35 (12) : 2783 - 2790
  • [35] Incremental clustering algorithm of mixed numerical and categorical data based on clustering ensemble
    Li, Tao-Ying
    Chen, Yan
    Zhang, Jin-Song
    Qin, Sheng-Jun
    Kongzhi yu Juece/Control and Decision, 2012, 27 (04): : 603 - 608
  • [36] Efficient algorithms based on the k-means and Chaotic League Championship Algorithm for numeric, categorical, and mixed-type data clustering
    Wangchamhan, Tanachapong
    Chiewchanwattana, Sirapat
    Sunat, Khamron
    EXPERT SYSTEMS WITH APPLICATIONS, 2017, 90 : 146 - 167
  • [37] A study on a fuzzy clustering for mixed numerical and categorical incomplete data
    Furukawa, Takashi
    Ohnishi, Shin-ichi
    Yamanoi, Takahiro
    2013 INTERNATIONAL CONFERENCE ON FUZZY THEORY AND ITS APPLICATIONS (IFUZZY 2013), 2013, : 425 - 428
  • [38] A Two-Step Method for Clustering Mixed Categroical and Numeric Data
    Shih, Ming-Yi
    Jheng, Jar-Wen
    Lai, Lien-Fu
    JOURNAL OF APPLIED SCIENCE AND ENGINEERING, 2010, 13 (01): : 11 - 19
  • [39] Categorical Data Clustering Using Harmony Search Algorithm for Healthcare Datasets
    Sharma, Abha
    Kumar, Pushpendra
    Babulal, Kanojia Sindhuben
    Obaid, Ahmed J.
    Patel, Harshita
    INTERNATIONAL JOURNAL OF E-HEALTH AND MEDICAL COMMUNICATIONS, 2022, 13 (04)
  • [40] Visualized Analysis of Mixed Numeric and Categorical Data via Extended Self-Organizing Map
    Hsu, Chung-Chian
    Lin, Shu-Han
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2012, 23 (01) : 72 - 86