Clustering with Partition Level Side Information

Cited by: 27
Authors
Liu, Hongfu [1]
Fu, Yun [1,2]
Affiliations
[1] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
[2] Northeastern Univ, Coll Comp & Informat Sci, Boston, MA 02115 USA
Keywords
Clustering; Partition level side information; K-means; Utility function; Algorithms
DOI
10.1109/ICDM.2015.18
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Constrained clustering uses prior knowledge to improve clustering performance. The existing literature usually focuses on Must-Link and Cannot-Link pairwise constraints. However, pairwise constraints not only run counter to the way we make decisions, but are also vulnerable to noisy constraints and sensitive to the order in which constraints are given. In light of this, we use partition level side information instead of pairwise constraints to guide the clustering process. Compared with pairwise constraints, partition level side information keeps the consistency within the partial structure and avoids self-contradiction and the impact of constraint order. Generally speaking, only a small part of the data instances are given labels by human workers, and these labels are used to supervise the clustering procedure. Inspired by the success of ensemble clustering, we aim to find a clustering solution that captures the intrinsic structure of the data itself and agrees with the partition level side information as much as possible. We then derive the objective function and equivalently transform it into a K-means-like optimization problem. Extensive experiments on several real-world datasets demonstrate the effectiveness and efficiency of our method compared to pairwise constrained clustering and consensus clustering, which verifies the superiority of partition level side information over pairwise constraints. In addition, our method is highly robust to noisy side information.
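The abstract does not state the objective function, so the following is only a rough sketch of the general idea it describes: a K-means-like loop in which every point is pulled toward a feature centroid, while the labeled points are additionally pulled toward clusters that agree with their given labels. The squared Euclidean penalty, the one-hot encoding of the side information, the trade-off weight `lam`, the deterministic initialization, and the function name `side_info_kmeans` are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def side_info_kmeans(X, side_labels, n_clusters, lam=1.0, n_iter=100):
    """Illustrative K-means-like clustering guided by partial labels.

    X           : (n, d) array of features.
    side_labels : (n,) int array; class index for labeled points, -1 if unlabeled.
    lam         : hypothetical weight on agreement with the side information.
    """
    X = np.asarray(X, dtype=float)
    side_labels = np.asarray(side_labels)
    n = X.shape[0]
    labeled = side_labels >= 0
    n_classes = int(side_labels.max()) + 1 if labeled.any() else 0

    # One-hot encode the side partition; unlabeled rows stay all-zero.
    S = np.zeros((n, n_classes))
    S[labeled, side_labels[labeled]] = 1.0

    # Deterministic start (an assumption for reproducibility):
    # evenly spaced rows serve as the initial centroids.
    start = np.linspace(0, n - 1, n_clusters).astype(int)
    C_x = X[start].copy()   # feature part of each centroid
    C_s = S[start].copy()   # side-information part of each centroid

    assign = np.full(n, -1)
    for _ in range(n_iter):
        # Assignment step: feature distance for every point, plus a
        # side-information distance that only labeled points pay.
        d_x = ((X[:, None, :] - C_x[None, :, :]) ** 2).sum(axis=2)
        d_s = ((S[:, None, :] - C_s[None, :, :]) ** 2).sum(axis=2)
        dist = d_x + lam * labeled[:, None] * d_s
        new_assign = dist.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break
        assign = new_assign

        # Update step: feature centroids average all members; side
        # centroids average only the labeled members of each cluster.
        for k in range(n_clusters):
            members = assign == k
            if members.any():
                C_x[k] = X[members].mean(axis=0)
            lab = members & labeled
            if lab.any():
                C_s[k] = S[lab].mean(axis=0)
    return assign

# Toy data: the point at 2.4 sits closer to the left group in feature space,
# but its label ties it to the right group.
X = np.array([[0.0], [1.0], [2.4], [4.0], [5.0]])
side = np.array([0, -1, 1, -1, 1])
plain = side_info_kmeans(X, side, n_clusters=2, lam=0.0)
guided = side_info_kmeans(X, side, n_clusters=2, lam=10.0)
```

With `lam=0.0` the side information is ignored and the loop reduces to ordinary K-means on this toy data; with a large `lam` the labeled point at 2.4 moves to the cluster that matches its label. Unlabeled points are never penalized, which reflects the abstract's premise that only a small part of the instances carry labels.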
Pages: 877-882
Page count: 6