Clustering with Partition Level Side Information

Cited by: 27
Authors
Liu, Hongfu [1]
Fu, Yun [1, 2]
Affiliations
[1] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
[2] Northeastern Univ, Coll Comp & Informat Sci, Boston, MA 02115 USA
Keywords
Clustering; Partition level side information; K-means; Utility function; ALGORITHMS;
DOI
10.1109/ICDM.2015.18
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Constrained clustering uses prior knowledge to improve clustering performance. The existing literature usually focuses on Must-Link and Cannot-Link pairwise constraints. However, pairwise constraints not only run counter to the way we make decisions, but are also vulnerable to noisy constraints and sensitive to the order of constraints. In light of this, we use partition level side information instead of pairwise constraints to guide the clustering process. Compared with pairwise constraints, partition level side information keeps the partial structure consistent and avoids self-contradiction and the impact of constraint order. Generally speaking, only a small part of the data instances are labeled by human workers, and these labels are used to supervise the clustering procedure. Inspired by the success of ensemble clustering, we aim to find a clustering solution that captures the intrinsic structure of the data itself and agrees with the partition level side information as much as possible. We then derive the objective function and equivalently transform it into a K-means-like optimization problem. Extensive experiments on several real-world datasets demonstrate the effectiveness and efficiency of our method compared to pairwise constrained clustering and consensus clustering, which verifies the superiority of partition level side information over pairwise constraints. In addition, our method is highly robust to noisy side information.
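The K-means-like formulation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is one plausible reading in which the partial labels are one-hot encoded and added as a weighted distance term that applies only to labeled instances. The function name `partition_level_kmeans`, the weight `lam`, and the convention that `-1` marks an unlabeled instance are all assumptions made for illustration.

```python
import numpy as np

def partition_level_kmeans(X, side_labels, n_clusters, lam=1.0, n_iter=50, seed=0):
    """K-means-like clustering guided by partial labels (illustrative sketch).

    side_labels: int array; -1 means unlabeled (assumed convention),
    otherwise a class index in 0..n_clusters-1.
    """
    X = np.asarray(X, dtype=float)
    side_labels = np.asarray(side_labels)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    labeled = side_labels >= 0

    # One-hot encode the side information; unlabeled rows stay all-zero.
    S = np.zeros((n, n_clusters))
    S[labeled, side_labels[labeled]] = 1.0

    # Feature centroids start at random data points; side-information
    # centroids start uniform so they carry no preference initially.
    C_x = X[rng.choice(n, size=n_clusters, replace=False)].copy()
    C_s = np.full((n_clusters, n_clusters), 1.0 / n_clusters)

    assign = np.full(n, -1)
    for _ in range(n_iter):
        d_x = ((X[:, None, :] - C_x[None, :, :]) ** 2).sum(axis=2)
        d_s = ((S[:, None, :] - C_s[None, :, :]) ** 2).sum(axis=2)
        # The side-information term only counts for labeled instances.
        dist = d_x + lam * labeled[:, None] * d_s
        new_assign = dist.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break
        assign = new_assign
        for k in range(n_clusters):
            members = assign == k
            if members.any():
                C_x[k] = X[members].mean(axis=0)
            # Side centroids are averaged over labeled members only.
            lab_members = members & labeled
            if lab_members.any():
                C_s[k] = S[lab_members].mean(axis=0)
    return assign

# Usage: six points in two groups, one labeled instance per group.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
side = np.array([0, -1, -1, 1, -1, -1])
labels = partition_level_kmeans(X, side, n_clusters=2)
```

The returned cluster indices are arbitrary and need not match the class indices used in `side_labels`; only the grouping is meaningful. A larger `lam` makes the result follow the side information more closely, at the cost of fitting the data structure less.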
Pages: 877-882
Number of pages: 6
Related papers
50 in total
  • [1] Clustering with Instance and Attribute Level Side Information
    Wang, Jinlong
    Wu, Shunyao
    Li, Gang
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2010, 3 (06) : 770 - 785
  • [2] Partition Level Constrained Clustering
    Liu, Hongfu
    Tao, Zhiqiang
    Fu, Yun
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (10) : 2469 - 2483
  • [3] Partition level multiview subspace clustering
    Kang, Zhao
    Zhao, Xinjia
    Peng, Chong
    Zhu, Hongyuan
    Zhou, Joey Tianyi
    Peng, Xi
    Chen, Wenyu
    Xu, Zenglin
    NEURAL NETWORKS, 2020, 122 : 279 - 288
  • [4] On Text Clustering with Side Information
    Aggarwal, Charu C.
    Zhao, Yuchen
    Yu, Philip S.
    2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012, : 894 - 904
  • [5] Semisupervised Fuzzy Clustering With Partition Information of Subsets
    Mei, Jian-Ping
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2019, 27 (09) : 1726 - 1737
  • [6] Query Complexity of Clustering with Side Information
    Mazumdar, Arya
    Saha, Barna
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [7] Internet traffic clustering with side information
    Wang, Yu
    Xiang, Yang
    Zhang, Jun
    Zhou, Wanlei
    Xie, Bailin
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2014, 80 (05) : 1021 - 1036
  • [8] Universal Joint Image Clustering and Registration using Partition Information
    Raman, Ravi Kiran
    Varshney, Lav R.
    2017 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2017, : 2168 - 2172
  • [9] Co-Clustering with Side Information for Text Mining
    Thomas, Ramya Elizabeth
    Khan, Shamsuddin S.
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON DATA MINING AND ADVANCED COMPUTING (SAPIENCE), 2016, : 105 - 108