A flocking based algorithm for document clustering analysis

Cited by: 54
Authors
Cui, Xiaohui [1 ]
Gao, Jinzhu [1 ]
Potok, Thomas E. [1 ]
Affiliation
[1] Oak Ridge Natl Lab, Oak Ridge, TN 37831 USA
Keywords
document clustering; bio-inspired; agent; flocking model; F-measure
DOI
10.1016/j.sysarc.2006.02.003
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Social animals and insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel flocking-based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks and fish schools. Unlike other partitional clustering algorithms such as K-means, the flocking-based algorithm does not require initial partition seeds. The algorithm generates a clustering of a given set of data by embedding the high-dimensional data items on a two-dimensional grid, which makes the clustering result easy to retrieve and visualize. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each boid cause the entire document flock to generate complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance than the K-means and Ant clustering algorithms for real document clustering. © 2006 Elsevier B.V. All rights reserved.
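The abstract does not spell out the local rules, so the following is only a minimal sketch of the general idea under stated assumptions: each document is a boid on a two-dimensional plane, nearby boids with similar feature vectors attract and align, and nearby dissimilar boids repel. The function names (`cosine`, `flock_step`, `simulate`), the cosine similarity measure, the `threshold`, `radius`, and `max_speed` parameters, and the wrap-around plane are all illustrative choices, not details taken from the paper.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def flock_step(pos, vel, docs, size=100.0, radius=15.0,
               max_speed=2.0, threshold=0.5):
    """Advance every boid by one step; returns new (positions, velocities).

    Assumed rules (hypothetical, for illustration only):
      - similar neighbours (similarity >= threshold): cohesion + alignment
      - dissimilar neighbours: repulsion scaled by (1 - similarity)
    """
    n = len(pos)
    new_pos, new_vel = [], []
    for i in range(n):
        ax = ay = 0.0
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            dist = math.hypot(dx, dy)
            if dist == 0.0 or dist > radius:
                continue  # only local neighbours influence a boid
            sim = cosine(docs[i], docs[j])
            if sim >= threshold:
                # move toward the similar neighbour and match its velocity
                ax += sim * dx / dist + 0.1 * (vel[j][0] - vel[i][0])
                ay += sim * dy / dist + 0.1 * (vel[j][1] - vel[i][1])
            else:
                # move away from the dissimilar neighbour
                ax -= (1.0 - sim) * dx / dist
                ay -= (1.0 - sim) * dy / dist
        vx, vy = vel[i][0] + ax, vel[i][1] + ay
        speed = math.hypot(vx, vy)
        if speed > max_speed:  # cap the speed so the flock stays stable
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        new_vel.append((vx, vy))
        # wrap around the edges of the plane
        new_pos.append(((pos[i][0] + vx) % size, (pos[i][1] + vy) % size))
    return new_pos, new_vel


def simulate(docs, steps=200, size=100.0, seed=0):
    """Random start (no cluster seeds needed), then iterate the local rules."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in docs]
    vel = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in docs]
    for _ in range(steps):
        pos, vel = flock_step(pos, vel, docs, size=size)
    return pos, vel
```

Calling `simulate(docs)` with a list of term-weight vectors returns the final boid positions and velocities. In a sketch like this, clusters would be read off as groups of boids that end up close together on the plane; how the paper extracts and scores clusters (for example with the F-measure listed in the keywords) is not described in the abstract.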
Pages: 505-515
Page count: 11
Related papers
50 in total
  • [1] Flocking-based Document Clustering on the Graphics Processing Unit
    Charles, Jesse St.
    Potok, Thomas E.
    Patton, Robert
    Cui, Xiaohui
    NATURE INSPIRED COOPERATIVE STRATEGIES FOR OPTIMIZATION (NICSO 2007), 2008, 129 : 27 - +
  • [2] A Document Clustering Method based on Hierarchical Algorithm with Model Clustering
    Sun, Haojun
    Liu, Zhihui
    Kong, Lingjun
    2008 22ND INTERNATIONAL WORKSHOPS ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOLS 1-3, 2008, : 1229 - +
  • [3] Induction Radius-Based Hierarchical Flocking Algorithm for UAVs Clustering
    Qiao, Xuhui
    Xu, Ziqiang
    Luo, Yanhong
    Sun, Yifan
    Liu, Yuxuan
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, : 170 - 176
  • [4] WAF-based Document Clustering Algorithm
    Luo, Yang
    Chen, Guang
    Zhang, Yongtian
    2011 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT), VOLS 1-4, 2012, : 14 - 16
  • [5] An adaptive flocking algorithm for performing approximate clustering
    Folino, Gianluigi
    Forestiero, Agostino
    Spezzano, Giandomenico
    INFORMATION SCIENCES, 2009, 179 (18) : 3059 - 3078
  • [6] A new document clustering algorithm based on association rule
    Song, JC
    Shen, JY
    Song, QB
    PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 1310 - 1313
  • [7] A parallel text document clustering algorithm based on neighbors
    Li, Yanjun
    Luo, Congnan
    Chung, Soon M.
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (02) : 933 - 948
  • [8] Fuzzy Document Clustering Based on Ant Colony Algorithm
    Wang, Fei
    Zhang, Dexian
    Bao, Na
    ADVANCES IN NEURAL NETWORKS - ISNN 2009, PT 2, PROCEEDINGS, 2009, 5552 : 709 - 716
  • [9] A fuzzy-based algorithm for Web document clustering
    Friedman, M
    Kandel, A
    Schneider, M
    Last, M
    Shapira, B
    Elovici, Y
    Zaafrany, O
    NAFIPS 2004: ANNUAL MEETING OF THE NORTH AMERICAN FUZZY INFORMATION PROCESSING SOCIETY, VOLS 1 AND 2: FUZZY SETS IN THE HEART OF THE CANADIAN ROCKIES, 2004, : 524 - 527
  • [10] CSIM: A document clustering algorithm based on Swarm Intelligence
    Wu, B
    Zheng, Y
    Liu, SH
    Shi, ZZ
    CEC'02: PROCEEDINGS OF THE 2002 CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1 AND 2, 2002, : 477 - 482