Using instance-level constraints in agglomerative hierarchical clustering: theoretical and empirical results

Cited by: 42
Authors
Davidson, Ian [1]
Ravi, S. S. [2]
Affiliations
[1] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
[2] SUNY Albany, Dept Comp Sci, Albany, NY 12222 USA
Funding
National Science Foundation (USA);
Keywords
Clustering; Constrained clustering; Semi-supervised learning;
DOI
10.1007/s10618-008-0103-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering with constraints is a powerful method that allows users to specify background knowledge and the expected cluster properties. Significant work has explored the incorporation of instance-level constraints into non-hierarchical clustering but not into hierarchical clustering algorithms. In this paper we present a formal complexity analysis of the problem and show that constraints can be used to improve not only the quality of the resultant dendrogram but also the efficiency of the algorithms. This is particularly important since many agglomerative-style algorithms have running times that are quadratic (or faster-growing) functions of the number of instances to be clustered. We present several bounds on the improvement in the running times of algorithms obtainable using constraints.
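As a rough illustration of what instance-level constraints mean for an agglomerative procedure, the sketch below runs single-linkage merging under must-link and cannot-link pairs. It is not the authors' algorithm: the function name `constrained_agglomerative`, the choice of single linkage with squared Euclidean distance, and the assumption that the supplied constraints are mutually consistent are all illustrative choices made here. Must-link pairs are collapsed into starting clusters up front, so the loop begins with fewer clusters than instances, and cannot-link pairs block merges, so the loop may stop before a single root is reached.

```python
from itertools import combinations


def constrained_agglomerative(points, must_link, cannot_link):
    """Single-linkage agglomeration under pairwise constraints (illustrative).

    points: list of coordinate tuples; constraints: lists of index pairs.
    Assumes the constraints are consistent (hypothetical simplification).
    Returns (final_clusters, merges).
    """
    # Union-find over instance indices to collapse must-link pairs.
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in must_link:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), set()).add(i)
    clusters = [frozenset(g) for g in groups.values()]

    def dist(c1, c2):
        # Single linkage: smallest squared Euclidean distance between members.
        return min(
            sum((x - y) ** 2 for x, y in zip(points[i], points[j]))
            for i in c1
            for j in c2
        )

    def allowed(c1, c2):
        # A merge is blocked if it would join a cannot-link pair.
        return not any(
            (a in c1 and b in c2) or (a in c2 and b in c1)
            for a, b in cannot_link
        )

    merges = []
    while True:
        candidates = [
            pair for pair in combinations(clusters, 2) if allowed(*pair)
        ]
        if not candidates:
            break  # one cluster left, or every remaining merge is forbidden
        c1, c2 = min(candidates, key=lambda pair: dist(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((c1, c2))
    return clusters, merges


if __name__ == "__main__":
    pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
    final, steps = constrained_agglomerative(pts, [(0, 1)], [(1, 2)])
    print(final, len(steps))
```

In this toy run the must-link pair removes one merge step before the loop starts, and the cannot-link pair prevents the final join, so the result is a truncated hierarchy rather than a full dendrogram.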
Pages: 257-282
Page count: 26
Related papers
50 in total
  • [21] Effective semi-supervised document clustering via active learning with instance-level constraints
    Zhao, Weizhong
    He, Qing
    Ma, Huifang
    Shi, Zhongzhi
    KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 30 (03) : 569 - 587
  • [22] Integrating Instance-level and Attribute-level Knowledge into Document Clustering
    Wang, Jinlong
    Wu, Shunyao
    Li, Gang
    Wei, Zhe
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2011, 8 (03) : 635 - 651
  • [23] Enhancing instance-level constrained clustering through differential evolution
    Gonzalez-Almagro, German
    Luengo, Julian
    Cano, Jose-Ramon
    Garcia, Salvador
    APPLIED SOFT COMPUTING, 2021, 108
  • [24] Semi-supervised Discriminant Analyze with Instance-Level Constraints
    Gong, Yun-Chao
    Chen, Chuanliana
    Shen, Min
    Fu, Zengmei
    HPCC 2008: 10TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, PROCEEDINGS, 2008, : 801 - +
  • [25] Clustering trees with instance level constraints
    Struyf, Jan
    Dzeroski, Saso
    MACHINE LEARNING: ECML 2007, PROCEEDINGS, 2007, 4701 : 359 - +
  • [26] Path-based similarity with instance-level constraints for SemiBoost
    Zhang, Xiangrong
    Yu, Jianshen
    Wang, Ting
    Hou, Biao
    Jiao, L. C.
    MIPPR 2013: PATTERN RECOGNITION AND COMPUTER VISION, 2013, 8919
  • [27] Constrained Agglomerative Hierarchical Software Clustering with Hard and Soft Constraints
    Chong, Chun Yong
    Lee, Sai Peck
    ENASE 2015 - PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON EVALUATION OF NOVEL APPROACHES TO SOFTWARE ENGINEERING, 2015, : 177 - 188
  • [28] Comparing Different Methods of Agglomerative Hierarchical Clustering with Pairwise Constraints
    Takumi, Satoshi
    Miyamoto, Sadaaki
    6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS, 2012, : 1545 - 1550
  • [29] Urdu ligature recognition using multi-level agglomerative hierarchical clustering
    Khan, Naila Habib
    Adnan, Awais
    Basar, Sadia
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2018, 21 (01) : 503 - 514