COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL): A Robust Method for Selection of Cluster Number, K

Cited by: 0
Authors
Timothy E. Sweeney
Albert C. Chen
Olivier Gevaert
Affiliations
[1] Institute for Immunity, Transplantation and Infection, Stanford University
[2] Biomedical Informatics Research, Stanford University, Stanford, CA 94305, United States
[3] Department of Statistics, Stanford University, Stanford, CA 94305, United States
Source
Scientific Reports, Volume 5
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification number
Abstract
In order to discover new subsets (clusters) of a data set, researchers often use algorithms that perform unsupervised clustering, namely, the algorithmic separation of a dataset into some number of distinct clusters. Deciding whether a particular separation (or number of clusters, K) is correct is a sort of ‘dark art’, with multiple techniques available for assessing the validity of unsupervised clustering algorithms. Here, we present a new technique for unsupervised clustering that uses multiple clustering algorithms, multiple validity metrics and progressively bigger subsets of the data to produce an intuitive 3D map of cluster stability that can help determine the optimal number of clusters in a data set, a technique we call COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL). COMMUNAL locally optimizes algorithms and validity measures for the data being used. We show its application to simulated data with a known K and then apply this technique to several well-known cancer gene expression datasets, showing that COMMUNAL provides new insights into clustering behavior and stability in all tested cases. COMMUNAL is shown to be a useful tool for determining K in complex biological datasets and is freely available as a package for R.
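The idea in the abstract (score several clustering algorithms over a range of K and over progressively larger subsets of the data, then read K off the resulting map) can be illustrated with a small sketch. This is not the COMMUNAL R package or its API. It is a simplified Python illustration under stated assumptions: "subsets of the data" is taken to mean the highest-variance features, only one validity metric (the mean silhouette width) is used instead of several, the two algorithms are a basic k-means and average-linkage agglomerative clustering, and all function names (`stability_map`, `top_variance_features`, and so on) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    # Farthest-first initialisation keeps the run deterministic.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def agglomerative(points, k):
    # Average-linkage merging until k clusters remain (naive, fine for toy data).
    clusters = [[i] for i in range(len(points))]
    def linkage(a, b):
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))
    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def silhouette(points, labels):
    # Mean silhouette width; points in singleton clusters contribute 0.
    groups = {}
    for i, l in enumerate(labels):
        groups.setdefault(l, []).append(i)
    if len(groups) < 2:
        return 0.0
    total = 0.0
    for i, l in enumerate(labels):
        own = groups[l]
        if len(own) == 1:
            continue
        a = sum(dist(points[i], points[j]) for j in own) / (len(own) - 1)
        b = min(sum(dist(points[i], points[j]) for j in g) / len(g)
                for m, g in groups.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

def top_variance_features(points, n):
    # Indices of the n features with the largest variance (an assumption of this sketch).
    def var(col):
        mean = sum(col) / len(col)
        return sum((x - mean) ** 2 for x in col) / len(col)
    variances = [var(col) for col in zip(*points)]
    return sorted(range(len(variances)), key=lambda f: -variances[f])[:n]

def stability_map(points, ks, subset_sizes, algorithms):
    # Mean validity score over algorithms for every (subset size, K) cell.
    grid = {}
    for n in subset_sizes:
        keep = top_variance_features(points, n)
        sub = [[p[f] for f in keep] for p in points]
        for k in ks:
            scores = [silhouette(sub, alg(sub, k)) for alg in algorithms]
            grid[(n, k)] = sum(scores) / len(scores)
    return grid

# Toy data with a known K = 3: two informative features and two noise features.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [[rng.gauss(0, 0.5), rng.gauss(cx, 0.5), rng.gauss(0, 0.5), rng.gauss(cy, 0.5)]
        for cx, cy in blob_centers for _ in range(10)]

ks = [2, 3, 4, 5]
subset_sizes = [2, 3, 4]
grid = stability_map(data, ks, subset_sizes, [kmeans, agglomerative])
best_per_subset = {n: max(ks, key=lambda k: grid[(n, k)]) for n in subset_sizes}
best_k = Counter(best_per_subset.values()).most_common(1)[0][0]

for n in subset_sizes:
    print(n, "features:", [round(grid[(n, k)], 2) for k in ks])
print("chosen K:", best_k)
```

Each row of the printed grid corresponds to one subset size and each column to one candidate K. A K that scores highest across most rows is the stable choice in this toy setting; the published method additionally combines multiple validity metrics, which this sketch omits.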
Related papers
50 in total
  • [1] COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL): A Robust Method for Selection of Cluster Number, K
    Sweeney, Timothy E.
    Chen, Albert C.
    Gevaert, Olivier
    SCIENTIFIC REPORTS, 2015, 5
  • [2] Clustering Algorithms with Automatic Selection of Cluster Number
    Ng, Michael
    2008 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, VOLS 1 AND 2, 2008, : 11 - 11
  • [3] Cluster selection in divisive clustering algorithms
    Savaresi, SM
    Boley, DL
    Bittanti, S
    Gazzaniga, G
    PROCEEDINGS OF THE SECOND SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2002, : 299 - 314
  • [4] NSS-AKmeans: An Agglomerative Fuzzy K-Means Clustering Method with Automatic Selection of Cluster Number
    Zhang, Yanfeng
    Xu, Xiaofei
    Ye, Yunming
    2ND IEEE INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER CONTROL (ICACC 2010), VOL. 2, 2010, : 32 - 38
  • [5] Categorical Data Clustering with Automatic Selection of Cluster Number
    Liao, Hai-Yong
    Ng, Michael K.
    FUZZY INFORMATION AND ENGINEERING, 2009, 1 (01) : 5 - 25
  • [6] Medoid Silhouette clustering with automatic cluster number selection
    Lenssen, Lars
    Schubert, Erich
    INFORMATION SYSTEMS, 2024, 120
  • [7] Robust Algorithms for Online k-means Clustering
    Bhaskara, Aditya
    Ruwanpathirana, Aravinda Kanchana
    ALGORITHMIC LEARNING THEORY, VOL 117, 2020, 117 : 148 - 173
  • [8] A Clustering Ensemble Method Based on Cluster Selection and Cluster Splitting
    Tang, Yuyang
    Liu, Xiabi
    PROCEEDINGS OF 2018 10TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING (ICMLC 2018), 2018, : 54 - 58
  • [9] Optimizing Number of Cluster Heads in Wireless Sensor Networks for Clustering Algorithms
    Pal, Vipin
    Singh, Girdhari
    Yadav, R. P.
    PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON SOFT COMPUTING FOR PROBLEM SOLVING (SOCPROS 2012), 2014, 236 : 1267 - 1274
  • [10] Content Aided Clustering and Cluster Head Selection Algorithms in Vehicular Networks
    Zhang, Kai
    Wang, Jingjing
    Jiang, Chunxiao
    Quek, Tony Q. S.
    Ren, Yong
    2017 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2017