Combining multiple weak clusterings

Cited by: 0
Authors
Topchy, A [1 ]
Jain, AK [1 ]
Punch, W [1 ]
Affiliations
[1] Michigan State Univ, Dept Comp Sci, E Lansing, MI 48824 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A data set can be clustered in many ways depending on the clustering algorithm employed, parameter settings used and other factors. Can multiple clusterings be combined so that the final partitioning of data provides better clustering? The answer depends on the quality of clusterings to be combined as well as the properties of the fusion method. First, we introduce a unified representation for multiple clusterings and formulate the corresponding categorical clustering problem. As a result, we show that the consensus function is related to the classical intra-class variance criterion using the generalized mutual information definition. Second, we show the efficacy of combining partitions generated by weak clustering algorithms that use data projections and random data splits. A simple explanatory model is offered for the behavior of combinations of such weak clustering components. We analyze the combination accuracy as a function of parameters controlling the power and resolution of component partitions as well as the learning dynamics vs. the number of clusterings involved. Finally, some empirical studies compare the effectiveness of several consensus functions.
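The idea summarized in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes that each weak clustering is k-means on a random one-dimensional projection, that every object is then represented by the concatenated one-hot codes of its labels across partitions, and that the consensus partition comes from running k-means again on that categorical representation. The names `kmeans`, `weak_partition`, `one_hot_labels`, and `consensus`, and all parameter values, are hypothetical.

```python
import random


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [list(points[0])]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        far = max(points, key=lambda p: min(d2(p, c) for c in centers))
        centers.append(list(far))

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre for every point.
        labels = [min(range(k), key=lambda j: d2(p, centers[j])) for p in points]
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def weak_partition(data, k, rng):
    """Assumed weak clusterer: k-means on a random one-dimensional projection."""
    direction = [rng.gauss(0, 1) for _ in data[0]]
    projected = [(sum(x * w for x, w in zip(row, direction)),) for row in data]
    return kmeans(projected, k)


def one_hot_labels(partitions, k):
    """Represent each object by the concatenated one-hot codes of its labels."""
    n = len(partitions[0])
    return [
        [1.0 if part[i] == j else 0.0 for part in partitions for j in range(k)]
        for i in range(n)
    ]


def consensus(data, k, n_partitions=15, seed=0):
    """Combine several weak partitions by clustering their label codes."""
    rng = random.Random(seed)
    partitions = [weak_partition(data, k, rng) for _ in range(n_partitions)]
    return kmeans(one_hot_labels(partitions, k), k)


# Toy data: two well-separated Gaussian blobs of 20 points each.
data_rng = random.Random(1)
data = [[data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)] for _ in range(20)]
data += [[data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)] for _ in range(20)]
labels = consensus(data, k=2)
```

One-hot coding is used here because it sidesteps the label correspondence problem: cluster 0 in one partition need not match cluster 0 in another, and each partition simply contributes its own block of columns.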
Pages: 331 - 338
Page count: 8
Related papers
50 in total
  • [1] CLICOM: Cliques for combining multiple clusterings
    Mimaroglu, Selim
    Yagci, Murat
    EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (02) : 1889 - 1901
  • [2] Combining multiple clusterings by soft correspondence
    Long, B
    Zhang, ZF
    Yu, PS
    FIFTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2005, : 282 - 289
  • [3] On combining multiple clusterings: an overview and a new perspective
    Tao Li
    Mitsunori Ogihara
    Sheng Ma
    Applied Intelligence, 2010, 33 : 207 - 219
  • [4] Combining multiple clusterings for protein structure prediction
    Sakar, C. Okan
    Kursun, Olcay
    Seker, Huseyin
    Gurgen, Fikret
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2014, 10 (02) : 162 - 174
  • [5] On combining multiple clusterings: an overview and a new perspective
    Li, Tao
    Ogihara, Mitsunori
    Ma, Sheng
    APPLIED INTELLIGENCE, 2010, 33 (02) : 207 - 219
  • [6] Combining multiple clusterings using evidence accumulation
    Fred, ALN
    Jain, AK
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2005, 27 (06) : 835 - 850
  • [7] Combining multiple clusterings using similarity graph
    Mimaroglu, Selim
    Erdil, Ertunc
    PATTERN RECOGNITION, 2011, 44 (03) : 694 - 703
  • [8] Consensus Methods for Combining Multiple Clusterings of Chemical Structures
    Saeed, Faisal
    Salim, Naomie
    Abdo, Ammar
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2013, 53 (05) : 1026 - 1034
  • [9] Combining multiple clusterings using fast simulated annealing
    Lu, Zhiwu
    Peng, Yuxin
    Ip, Horace H. S.
    PATTERN RECOGNITION LETTERS, 2011, 32 (15) : 1956 - 1961
  • [10] Combining multiple clusterings via k-modes algorithm
    Luo, Huilan
    Kong, Fansheng
    Li, Yixiao
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2006, 4093 : 308 - 315