A novel incremental conceptual hierarchical text clustering method using CFu-tree

Cited by: 13
Authors
Peng, Tao [1 ,2 ,3 ]
Liu, Lu [1 ,2 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[3] Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Peoples R China
Keywords
Text clustering; CFu-tree; Comparison Variation (CV); Incremental hierarchical clustering; EFFICIENT ALGORITHM;
DOI
10.1016/j.asoc.2014.11.015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering, one of the most important tools in information retrieval, is a data mining method that organizes data through unsupervised learning, so it requires no training data. However, some text clustering algorithms cannot update existing clusters incrementally and must instead recompute a new clustering from scratch. To address this, this paper presents a novel bottom-up incremental conceptual hierarchical text clustering approach using a CFu-tree representation (ICHTC-CF), which starts with each item as a separate cluster. Term-based feature extraction is used to summarize each cluster during the process. The Comparison Variation (CV) measure is adopted as the criterion for judging whether the closest pair of clusters can be merged or a previous cluster can be split. In addition, our incremental clustering method is not sensitive to the order of the input data. Experimental results show that our method outperforms k-means, CLIQUE, single linkage clustering and complete linkage clustering, which indicates that the new technique is efficient and feasible. (C) 2014 Elsevier B.V. All rights reserved.
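The bottom-up merging idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ICHTC-CF implementation: the abstract does not define the CFu-tree or the Comparison Variation measure, so this sketch assumes plain term-count summaries per cluster and substitutes a cosine-similarity threshold (the hypothetical `min_sim` parameter) as the merge test. It covers only batch merging, not incremental insertion or cluster splitting.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def agglomerate(docs: list[str], min_sim: float = 0.3) -> list[dict]:
    """Start with one cluster per document and keep merging the closest
    pair until no pair reaches min_sim.

    min_sim is a hypothetical stand-in for the Comparison Variation
    criterion, which the abstract does not define.
    """
    # Each cluster is summarized by the summed term counts of its members.
    clusters = [{"members": [i], "terms": Counter(d.lower().split())}
                for i, d in enumerate(docs)]
    while len(clusters) > 1:
        # Find the most similar pair of cluster summaries.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: cosine(clusters[p[0]]["terms"],
                                        clusters[p[1]]["terms"]))
        if cosine(clusters[i]["terms"], clusters[j]["terms"]) < min_sim:
            break
        # Summaries are additive, so a merge only sums the term counts.
        merged = {"members": clusters[i]["members"] + clusters[j]["members"],
                  "terms": clusters[i]["terms"] + clusters[j]["terms"]}
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters
```

Because each summary is just a sum of term counts, merging two clusters never requires revisiting the original documents. This is the general appeal of cluster-feature summaries for incremental settings.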
Pages: 269-278
Page count: 10
Related papers
50 in total
  • [31] Hierarchical Text Clustering and Categorisation using A Semi-Supervised Framework
    Mahyoub, Mohamed
    Hind, Jade
    Woods, David
    Wong, Carl
    Hussain, Abir
    Aljumeily, Dhiya
    12TH INTERNATIONAL CONFERENCE ON THE DEVELOPMENTS IN ESYSTEMS ENGINEERING (DESE 2019), 2019, : 153 - 159
  • [32] Clustering Sentence Level-Text using Fuzzy Hierarchical Algorithm
    Priya, G. Krishna
    Anupriya, G.
    2013 INTERNATIONAL CONFERENCE ON HUMAN COMPUTER INTERACTIONS (ICHCI), 2013,
  • [33] Domain taxonomy learning from text: The subsumption method versus hierarchical clustering
    de Knijff, Jeroen
    Frasincar, Flavius
    Hogenboom, Frederik
    DATA & KNOWLEDGE ENGINEERING, 2013, 83 : 54 - 69
  • [34] A Novel Spreading Framework Using Incremental Clustering for Viral Marketing
    AlSuwaidan, Lulwah
    Ykhlef, Mourad
    Alnuem, Mohammed Abdullah
    2014 IEEE/ACS 11TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2014, : 78 - 83
  • [35] A novel clustering approach using hierarchical genetic algorithms
    Lai, CC
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2005, 11 (03): : 143 - 153
  • [36] Minimum spanning tree based split-and-merge: A hierarchical clustering method
    Zhong, Caiming
    Miao, Duoqian
    Franti, Pasi
    INFORMATION SCIENCES, 2011, 181 (16) : 3397 - 3410
  • [37] A parallel text clustering method using Spark and hashing
    Ben HajKacem, Mohamed Aymen
    Ben N'cir, Chiheb-Eddine
    Essoussi, Nadia
    COMPUTING, 2021, 103 (09) : 2007 - 2031
  • [38] A parallel text clustering method using Spark and hashing
    Mohamed Aymen Ben HajKacem
    Chiheb-Eddine Ben N’cir
    Nadia Essoussi
    Computing, 2021, 103 : 2007 - 2031
  • [39] Spark Based Text Clustering Method Using Hashing
    Ben HajKacem, Mohamed Aymen
    Ben N'Cir, Chiheb-Eddine
    Essoussi, Nadia
    BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY (DAWAK 2021), 2021, 12925 : 137 - 142
  • [40] A Novel Stable Clustering Design Method for Hierarchical Satellite Network
    Zhou Mu, Guo Qing, Wang Zhenyong (Communication Research Center, School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China)
    Chinese Journal of Aeronautics, 2010, 23 (01) : 91 - 102