On constructing an optimal consensus clustering from multiple clusterings

Cited by: 7
Authors
Berman, Piotr [1]
DasGupta, Bhaskar
Kao, Ming-Yang
Wang, Jie
Affiliations
[1] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
[2] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[3] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
[4] Univ Massachusetts, Dept Comp Sci, Lowell, MA 01854 USA
Funding
National Natural Science Foundation of China; US National Science Foundation;
Keywords
computational complexity; approximation algorithms; consensus clustering;
DOI
10.1016/j.ipl.2007.06.008
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Computing a suitable measure of consensus among several clusterings on the same data is an important problem that arises in several areas such as computational biology and data mining. In this paper, we formalize a set-theoretic model for computing such a similarity measure. Roughly speaking, in this model we have k > 1 partitions (clusters) of the same data set each containing the same number of sets and the goal is to align the sets in each partition to minimize a similarity measure. For k = 2, a polynomial-time solution was proposed by Gusfield (Information Processing Letters 82 (2002) 159-164). In this paper, we show that the problem is MAX-SNP-hard for k = 3 even if each partition in each cluster contains no more than 2 elements and provide a (2 - 2/k)-approximation algorithm for the problem for any k. © 2007 Elsevier B.V. All rights reserved.
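To make the alignment idea in the abstract concrete, the following is a minimal sketch for the k = 2 case only. It rests on assumptions that are not stated in this record: the cost of an alignment is taken to be the sum of the sizes of the symmetric differences between paired sets, and the search is a brute-force scan over all permutations. This is not Gusfield's polynomial-time method and not the approximation algorithm of the paper; the names `alignment_cost` and `best_alignment` are hypothetical.

```python
from itertools import permutations


def alignment_cost(p1, p2, perm):
    """Sum of symmetric-difference sizes when p1[i] is paired with p2[perm[i]].

    Assumed cost function for illustration; the paper's exact measure
    is not given in this record.
    """
    return sum(len(p1[i] ^ p2[perm[i]]) for i in range(len(p1)))


def best_alignment(p1, p2):
    """Brute-force search over all pairings of the sets in two partitions.

    Exponential in the number of sets, so only suitable for tiny inputs.
    """
    if len(p1) != len(p2):
        raise ValueError("both partitions must contain the same number of sets")
    best = min(
        permutations(range(len(p2))),
        key=lambda perm: alignment_cost(p1, p2, perm),
    )
    return best, alignment_cost(p1, p2, best)


# Two partitions of the data set {1, ..., 6}, each with three sets.
p1 = [{1, 2, 3}, {4, 5}, {6}]
p2 = [{4, 5, 6}, {1, 2}, {3}]

perm, cost = best_alignment(p1, p2)
print(perm, cost)  # (1, 0, 2) 4
```

Here `perm[i]` is the index of the set in `p2` paired with `p1[i]`. For realistic sizes the exhaustive scan would have to be replaced by an assignment-problem solver, which is the kind of polynomial-time route available when k = 2.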
Pages: 137-145
Page count: 9
Related papers
50 in total
  • [41] A framework to uncover multiple alternative clusterings
    Xuan Hong Dang
    James Bailey
    Machine Learning, 2015, 98 : 7 - 30
  • [42] Maximum likelihood combination of multiple clusterings
    Hu, Tianming
    Yu, Ying
    Xiong, Jinzhi
    Sung, Sam Yuan
    PATTERN RECOGNITION LETTERS, 2006, 27 (13) : 1457 - 1464
  • [43] Time optimal consensus tracking with multiple leaders
    Chaudhari, Aditya
    Chakraborty, Debraj
    INTERNATIONAL JOURNAL OF CONTROL, 2021, 94 (08) : 2282 - 2295
  • [44] Multiple clusterings of heterogeneous information networks
    Shaowei Wei
    Guoxian Yu
    Jun Wang
    Carlotta Domeniconi
    Xiangliang Zhang
    Machine Learning, 2021, 110 : 1505 - 1526
  • [45] Exploring Multiple Clusterings in Attributed Graphs
    Guedes, Gustavo Paiva
    Bezerra, Eduardo
    Ogasawara, Eduardo
    Xexeo, Geraldo
    30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II, 2015, : 915 - 918
  • [46] COCA: Constructing optimal clustering architecture to maximize sensor network lifetime
    Li, Huan
    Liu, Yanlei
    Chen, Weifeng
    Jia, Weijia
    Li, Bing
    Xiong, Junwu
    COMPUTER COMMUNICATIONS, 2013, 36 (03) : 256 - 268
  • [47] A Diversified Attention Model for Interpretable Multiple Clusterings
    Ren, Liangrui
    Yu, Guoxian
    Wang, Jun
    Liu, Lei
    Domeniconi, Carlotta
    Zhang, Xiangliang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (09) : 8852 - 8864
  • [48] On combining multiple clusterings: an overview and a new perspective
    Tao Li
    Mitsunori Ogihara
    Sheng Ma
    Applied Intelligence, 2010, 33 : 207 - 219
  • [49] Optimal Neighborhood Kernel Clustering with Multiple Kernels
    Liu, Xinwang
    Zhou, Sihang
    Wang, Yueqing
    Li, Miaomiao
    Dou, Yong
    Zhu, En
    Yin, Jianping
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2266 - 2272
  • [50] Obtaining better quality final clustering by merging a collection of clusterings
    Mimaroglu, Selim
    Erdil, Ertunc
    BIOINFORMATICS, 2010, 26 (20) : 2645 - 2646