Combining multiple weak clusterings

Authors
Topchy, A [1]
Jain, AK [1]
Punch, W [1]
Affiliation
[1] Michigan State Univ, Dept Comp Sci, E Lansing, MI 48824 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A data set can be clustered in many ways depending on the clustering algorithm employed, parameter settings used and other factors. Can multiple clusterings be combined so that the final partitioning of data provides better clustering? The answer depends on the quality of clusterings to be combined as well as the properties of the fusion method. First, we introduce a unified representation for multiple clusterings and formulate the corresponding categorical clustering problem. As a result, we show that the consensus function is related to the classical intra-class variance criterion using the generalized mutual information definition. Second, we show the efficacy of combining partitions generated by weak clustering algorithms that use data projections and random data splits. A simple explanatory model is offered for the behavior of combinations of such weak clustering components. We analyze the combination accuracy as a function of parameters controlling the power and resolution of component partitions as well as the learning dynamics vs. the number of clusterings involved. Finally, some empirical studies compare the effectiveness of several consensus functions.
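As a rough illustration of the second point in the abstract, the following Python sketch builds weak partitions by running a two-cluster k-means on random one-dimensional projections of the data, then merges them through a co-association matrix. The thresholded connected-components step is a simple stand-in chosen for brevity, not the consensus function derived in the paper; the function names, the toy data and the 0.5 threshold are all assumptions made for this example.

```python
import random

def project_1d(points, rng):
    """Project each point onto a random direction (a 'weak' 1-D view of the data)."""
    dim = len(points[0])
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [sum(d * x for d, x in zip(direction, p)) for p in points]

def kmeans_1d(values, k, rng, iters=20):
    """Plain Lloyd iterations on scalar values; returns one label per value."""
    centers = rng.sample(values, k)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = sum(members) / len(members)
    return labels

def co_association(partitions, n):
    """Fraction of partitions in which each pair of objects shares a cluster."""
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]

def consensus(co, threshold=0.5):
    """Connected components of the graph linking pairs whose co-association exceeds threshold."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical toy data: two Gaussian blobs in 2-D.
rng = random.Random(0)
points = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(20)] + \
         [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(20)]

# 50 weak partitions, each from a fresh random projection.
partitions = [kmeans_1d(project_1d(points, rng), 2, rng) for _ in range(50)]
final = consensus(co_association(partitions, len(points)))
```

Each individual partition can be poor when the random direction happens to hide the cluster structure; the co-association matrix averages over many such views, which is the intuition behind combining weak components. The threshold and the number of partitions are free parameters here and would need tuning on real data.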
Pages: 331 - 338
Page count: 8