Semi-supervised Parameter-Free Divisive Hierarchical Clustering of Categorical Data

Authors
Xiong, Tengke [1]
Wang, Shengrui [1]
Mayers, Andre [1]
Monga, Ernest [2]
Affiliations
[1] Univ Sherbrooke, Dept Comp Sci, Sherbrooke, PQ J1K 2R1, Canada
[2] Univ Sherbrooke, Dept Math, Sherbrooke, PQ J1K 2R1, Canada
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semi-supervised clustering can yield considerable improvement over unsupervised clustering. Most existing semi-supervised clustering algorithms are non-hierarchical, derived from the k-means algorithm and designed for analyzing numeric data. Clustering categorical data is a challenging issue due to the lack of an inherently meaningful similarity measure, and semi-supervised clustering in the categorical domain remains untouched. In this paper, we propose a novel semi-supervised divisive hierarchical algorithm for categorical data. Our algorithm is parameter-free, fully automatic and effective in taking advantage of instance-level constraint background knowledge to improve the quality of the resultant dendrogram. Experiments on real-life data demonstrate the promising performance of our algorithm.
Pages: 265-276
Number of pages: 12
Related papers
50 in total
  • [41] Developing ensemble clustering through similarity measures: A semi-supervised hierarchical clustering learning
    Wang, Dandan
    Li, Qi
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2024, 36 (16)
  • [42] Kernel parameter optimization for semi-supervised fuzzy clustering with pairwise constraints
    Na, Wang
    Xia, Li
    CHINESE JOURNAL OF ELECTRONICS, 2008, 17 (02) : 297 - 300
  • [43] Machine learning integrated credibilistic semi supervised clustering for categorical data
    Sarkar, Jnanendra Prasad
    Saha, Indrajit
    Chakraborty, Sinjan
    Maulik, Ujjwal
    APPLIED SOFT COMPUTING, 2020, 86
  • [44] Towards Parameter-Free Clustering for Real-World Data
    Hou, Jian
    Yuan, Huaqiang
    Pelillo, Marcello
    PATTERN RECOGNITION, 2023, 134
  • [45] A Hybrid and Parameter-Free Clustering Algorithm for Large Data Sets
    Shao, Hengkang
    Zhang, Ping
    Chen, Xinye
    Li, Fang
    Du, Guanglong
    IEEE ACCESS, 2019, 7 : 24806 - 24818
  • [46] Integrative Parameter-Free Clustering of Data with Mixed Type Attributes
    Boehm, Christian
    Goebl, Sebastian
    Oswald, Annahita
    Plant, Claudia
    Plavinski, Michael
    Wackersreuther, Bianca
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT I, PROCEEDINGS, 2010, 6118 : 38 - +
  • [47] Semi-supervised sparse representation collaborative clustering of incomplete data
    Tingquan Deng
    Jingyu Wang
    Qingwei Jia
    Ming Yang
    Applied Intelligence, 2023, 53 : 31077 - 31105
  • [48] Incremental semi-supervised clustering in a data stream with a flock of agents
    Bruneau, Pierrick
    Picarougne, Fabien
    Gelgon, Marc
    2009 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-5, 2009 : 3067 - 3074
  • [49] A semi-supervised approach to projected clustering with applications to microarray data
    Yip, Kevin Y.
    Cheung, Lin
    Cheung, David W.
    Jing, Liping
    Ng, Michael K.
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2009, 3 (03) : 229 - 259
  • [50] Data Stream Classification by Adaptive Semi-supervised Fuzzy Clustering
    Castellano, Giovanna
    Fanelli, Anna Maria
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, PT II, 2017, 10614 : 770 - 771