A novel incremental conceptual hierarchical text clustering method using CFu-tree

Cited by: 13
Authors
Peng, Tao [1 ,2 ,3 ]
Liu, Lu [1 ,2 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[3] Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Peoples R China
Keywords
Text clustering; CFu-tree; Comparison Variation (CV); Incremental hierarchical clustering; EFFICIENT ALGORITHM;
DOI
10.1016/j.asoc.2014.11.015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering, one of the most important tools in information retrieval, is a data mining method that organizes data by unsupervised learning, so it requires no training data. However, some text clustering algorithms cannot update existing clusters incrementally and must instead recompute a new clustering from scratch. To address this, the paper presents a novel bottom-up incremental conceptual hierarchical text clustering approach using a CFu-tree representation (ICHTC-CF), which starts with each item as a separate cluster. Term-based feature extraction is used to summarize each cluster during the process. The Comparison Variation (CV) criterion is used to judge whether the closest pair of clusters can be merged or a previous cluster can be split. The incremental clustering method is also not sensitive to the order of the input data. Experimental results show that the method outperforms k-means, CLIQUE, single linkage clustering and complete linkage clustering, which indicates that the new technique is efficient and feasible. (C) 2014 Elsevier B.V. All rights reserved.
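The bottom-up idea in the abstract (start with each item as its own cluster, summarize clusters by their terms, merge the closest pair while a criterion allows it) can be illustrated with a minimal sketch. This is not the ICHTC-CF algorithm: the abstract does not define the CFu-tree or the Comparison Variation measure, so the sketch assumes plain term-count summaries, cosine similarity, and a fixed `threshold` as a hypothetical stand-in for the merge test. It also omits splitting and incremental insertion. The function names are illustrative only.

```python
from collections import Counter
from math import sqrt


def term_vector(text):
    """Bag-of-words term counts for one document (lowercased, whitespace split)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def agglomerate(docs, threshold=0.3):
    """Bottom-up clustering: start from singletons and merge the closest pair.

    Each cluster is summarized by the sum of its members' term counts.
    `threshold` is an assumed stand-in for the paper's CV merge criterion.
    """
    clusters = [([i], term_vector(d)) for i, d in enumerate(docs)]
    while len(clusters) > 1:
        # Find the most similar pair of cluster summaries.
        best_sim, best_pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(clusters[i][1], clusters[j][1])
                if sim > best_sim:
                    best_sim, best_pair = sim, (i, j)
        # Stop once even the closest pair is too dissimilar to merge.
        if best_sim < threshold:
            break
        i, j = best_pair
        members = clusters[i][0] + clusters[j][0]
        summary = clusters[i][1] + clusters[j][1]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((members, summary))
    # Return document indices per cluster, in a stable order.
    return sorted(sorted(m) for m, _ in clusters)


docs = [
    "text clustering tree",
    "hierarchical text clustering",
    "web crawler link",
    "topical web crawler",
]
print(agglomerate(docs))
```

With these four toy documents, the two clustering texts share terms only with each other, as do the two crawler texts, so merging stops at two clusters. The pairwise search is quadratic per merge, which is acceptable for an illustration but is exactly the kind of recomputation that a tree-based incremental structure is meant to avoid.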
Pages: 269 - 278
Number of pages: 10