A flocking based algorithm for document clustering analysis

Cited by: 54
Authors
Cui, Xiaohui [1]
Gao, Jinzhu [1]
Potok, Thomas E. [1]
Affiliations
[1] Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
Keywords
document clustering; bio-inspired; agent; flocking model; F-measure
DOI
10.1016/j.sysarc.2006.02.003
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel flocking-based approach for document clustering analysis. Our flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partitional clustering algorithms such as K-means, the flocking-based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data by embedding the high-dimensional data items on a two-dimensional grid, which makes the clustering result easy to retrieve and visualize. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the flocking clustering algorithm achieves better performance than the K-means and the Ant clustering algorithms for real document clustering. © 2006 Elsevier B.V. All rights reserved.
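The abstract does not spell out the local rules each boid follows, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each document is a feature vector carried by a boid on a bounded two-dimensional plane, that similarity is measured with cosine similarity, and that a boid approaches and aligns with similar neighbours while moving away from dissimilar ones. The names cosine and flock_step, and all parameter values (radius, threshold, max_speed, size, and the 0.1 weights), are hypothetical choices made for illustration.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def flock_step(pos, vel, docs, radius=10.0, threshold=0.5,
               max_speed=2.0, size=100.0):
    """Advance every boid one time step; return new (pos, vel) lists.

    All rules and constants here are illustrative assumptions.
    """
    new_pos, new_vel = [], []
    for i, (p, v) in enumerate(zip(pos, vel)):
        ax = ay = 0.0
        for j, q in enumerate(pos):
            if i == j:
                continue
            dx, dy = q[0] - p[0], q[1] - p[1]
            dist = math.hypot(dx, dy)
            if dist == 0.0 or dist > radius:
                continue  # only neighbours inside the sensing radius matter
            ux, uy = dx / dist, dy / dist  # unit vector towards neighbour j
            if cosine(docs[i], docs[j]) >= threshold:
                # Similar document: approach it (or back off when closer
                # than one unit) and nudge velocity towards the neighbour's.
                pull = 1.0 if dist > 1.0 else -1.0
                ax += pull * ux + 0.1 * (vel[j][0] - v[0])
                ay += pull * uy + 0.1 * (vel[j][1] - v[1])
            else:
                # Dissimilar document: move away, more strongly when close.
                ax -= ux / dist
                ay -= uy / dist
        vx, vy = v[0] + 0.1 * ax, v[1] + 0.1 * ay
        speed = math.hypot(vx, vy)
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        # Keep the boid inside the square [0, size] x [0, size].
        x = min(max(p[0] + vx, 0.0), size)
        y = min(max(p[1] + vy, 0.0), size)
        new_pos.append((x, y))
        new_vel.append((vx, vy))
    return new_pos, new_vel


if __name__ == "__main__":
    random.seed(0)
    # Two made-up "topics": 10 boids per topic, random start positions.
    docs = [[1.0, 0.0]] * 10 + [[0.0, 1.0]] * 10
    pos = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in docs]
    vel = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in docs]
    for _ in range(200):
        pos, vel = flock_step(pos, vel, docs)
    for d, p in zip(docs, pos):
        print(d, round(p[0], 1), round(p[1], 1))
```

No initial seeds or cluster count are supplied in this sketch: any grouping comes only from the repeated local interactions, and the final two-dimensional positions can be plotted directly to inspect the result.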
Pages: 505-515
Page count: 11
Related papers
50 in total
  • [31] A distributed agent implementation of multiple species flocking model for document partitioning clustering
    Cui, Xiaohui
    Potok, Thomas E.
    COOPERATIVE INFORMATION AGENTS X, PROCEEDINGS, 2006, 4149 : 124 - 137
  • [32] Document clustering analysis with aid of adaptive Jaro Winkler with Jellyfish search clustering algorithm
    Pitchandi, Perumal
    Balakrishnan, Mathivanan
    ADVANCES IN ENGINEERING SOFTWARE, 2023, 175
  • [33] Document clustering analysis with aid of adaptive Jaro Winkler with Jellyfish search clustering algorithm
    Pitchandi, Perumal
    Balakrishnan, Mathivanan
    Advances in Engineering Software, 2023, 175
  • [34] Application of fuzzy clustering algorithm in Chinese document clustering
    Li, Jiafu
    Zhang, Yafei
    Lu, Jianjiang
    Jisuanji Gongcheng/Computer Engineering, 2002, 28 (04):
  • [35] Application of Algorithm CARDBK in Document Clustering
    ZHU Yehang
    ZHANG Mingjie
    SHI Feng
    Wuhan University Journal of Natural Sciences, 2018, 23 (06) : 514 - 524
  • [36] A Robust Algorithm for Fuzzy Document Clustering
    Chen, Lifei
    Wang, Shengrui
    Jiang, Qingshan
    2009 INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS WORKSHOPS: WAINA, VOLS 1 AND 2, 2009, : 679 - +
  • [37] An extended chameleon algorithm for document clustering
    Amrita Vishwa Vidyapeetham, Dept. of Computer Science and Application, India
    Adv. Intell. Sys. Comput., (335-348):
  • [38] An improved clustering algorithm for web document
    Wang, Jing
    Liu, Zhijing
    Journal of Information and Computational Science, 2009, 6 (02) : 959 - 966
  • [39] A Novel Algorithm for Automatic Document Clustering
    Agrawal, Ranjana
    Phatak, Madhura
    PROCEEDINGS OF THE 2013 3RD IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2013, : 877 - 882
  • [40] Frequent Document Mining Algorithm with Clustering
    Soni, Rakesh Kumar
    Gupta, Neetesh
    Sinhal, Amit
    Sahu, Shiv K.
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2015, 15 (09) : 38 - 43