Finding multiple stable clusterings

Authors
Juhua Hu
Qi Qian
Jian Pei
Rong Jin
Shenghuo Zhu
Affiliations
[1] Simon Fraser University, School of Computing Science
[2] Alibaba Group
Keywords
Multi-clustering; Clustering stability; Laplacian eigengap; Feature subspace
DOI
Not available
Abstract
Multi-clustering, which tries to find multiple independent ways to partition a data set into groups, has many applications, such as customer relationship management, bioinformatics and healthcare informatics. This paper addresses two fundamental questions in multi-clustering: how to model the quality of clusterings, and how to find multiple stable clusterings (MSC). We introduce to multi-clustering the notion of clustering stability based on the Laplacian eigengap, which was originally used by the regularized spectral learning method for similarity matrix learning. We prove mathematically that the larger the eigengap, the more stable the clustering. Furthermore, we propose a novel multi-clustering method, MSC. One advantage of our method compared with state-of-the-art multi-clustering methods is that it provides users with a feature subspace for understanding each clustering solution. Another advantage is that MSC does not require users to specify the number of clusters or the number of alternative clusterings, which is usually difficult to do without guidance. Our method can heuristically estimate the number of stable clusterings in a data set. We also discuss a practical way to make MSC applicable to large-scale data. We report an extensive empirical study that demonstrates the effectiveness of our method.
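The eigengap notion mentioned in the abstract can be illustrated with a minimal sketch. This is not the MSC method itself: it assumes the textbook spectral-clustering definition (the gap between the k-th and (k+1)-th smallest eigenvalues of the symmetric normalized graph Laplacian), which may differ from the paper's exact formulation. The function names `rbf_similarity` and `laplacian_eigengap`, the Gaussian similarity, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def rbf_similarity(X, sigma=1.0):
    """Gaussian (RBF) similarity matrix; an assumed choice of similarity."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def laplacian_eigengap(W, k):
    """Gap between the k-th and (k+1)-th smallest eigenvalues of
    the normalized Laplacian L = I - D^{-1/2} W D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(L)  # returned in ascending order
    return eigvals[k] - eigvals[k - 1]


rng = np.random.default_rng(0)

# Two well-separated groups: a 2-way partition should look stable.
two_blobs = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(10.0, 0.3, size=(20, 2)),
])
# One unstructured group: a 2-way partition should look unstable.
one_blob = rng.normal(0.0, 0.3, size=(40, 2))

gap_two = laplacian_eigengap(rbf_similarity(two_blobs), k=2)
gap_one = laplacian_eigengap(rbf_similarity(one_blob), k=2)
print(f"eigengap at k=2, two blobs: {gap_two:.3f}")
print(f"eigengap at k=2, one blob:  {gap_one:.3f}")
```

Under these assumptions, the data with two clear groups should show a much larger gap at k=2 than the unstructured data, which is the intuition behind using the eigengap as a stability score.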
Pages: 991-1021
Page count: 30