A clustering algorithm for data stream based on grid-tree and similarity

Authors
Huang G. [1]
Guo W. [1]
Ren J. [1]
Chen L. [1]
Affiliation
[1] College of Information Science and Engineering
Keywords
Clustering; Data stream; Grid; Similarity
DOI
10.4156/ijact.vol3.issue9.3
Abstract
Algorithms based on k-means cannot find clusters of arbitrary shape, and they require the number of clusters to be specified in advance. Moreover, most grid-based clustering algorithms cannot handle boundary points accurately. To address these issues, a novel approach based on a density grid-tree and similarity, DGTSstream, is proposed. In DGTSstream, each new data record is mapped into the grid-tree, and sporadic grids are removed by setting an update cycle and a noise density threshold. The average density is used to derive the density threshold. The algorithm repeatedly seeks the maximum-density grid that has no cluster flag and uses it as the starting point for finding a cluster with a depth-first strategy. Finally, similarity is used to handle the boundary points. Experimental results show that the algorithm can find clusters of arbitrary shape and has better clustering accuracy and efficiency.
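The seed-and-expand step described in the abstract can be illustrated with a minimal sketch. This is not the authors' DGTSstream implementation: the abstract gives no formulas, so the sketch makes several assumptions. It uses a flat dictionary of cells instead of a grid-tree, treats a cell as dense when its count reaches the average density, treats cells as neighbours when each coordinate differs by at most one, and omits the update cycle and the similarity-based boundary step. The names `cell_of`, `cluster_cells`, and `width` are hypothetical.

```python
from collections import Counter
from itertools import product


def cell_of(point, width):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(int(x // width) for x in point)


def cluster_cells(points, width=1.0):
    """Label dense grid cells by depth-first expansion from dense seeds."""
    density = Counter(cell_of(p, width) for p in points)
    if not density:
        return {}

    # Assumption: the density threshold is the average cell density.
    threshold = sum(density.values()) / len(density)
    dense = {c for c, n in density.items() if n >= threshold}

    # Assumption: neighbours differ by at most 1 in every coordinate.
    dims = len(next(iter(density)))
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]

    labels = {}
    next_label = 0
    # Visiting dense cells in descending density and skipping labeled ones
    # is equivalent to repeatedly picking the densest unlabeled cell.
    for seed in sorted(dense, key=lambda c: (-density[c], c)):
        if seed in labels:
            continue
        labels[seed] = next_label
        stack = [seed]  # explicit stack gives depth-first expansion
        while stack:
            cell = stack.pop()
            for off in offsets:
                nb = tuple(a + b for a, b in zip(cell, off))
                if nb in dense and nb not in labels:
                    labels[nb] = next_label
                    stack.append(nb)
        next_label += 1
    return labels
```

Cells that fall below the threshold receive no label in this sketch; in the paper, such sparse cells and boundary points are handled by the noise threshold and the similarity measure, neither of which is reproduced here.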
Pages: 17-24
Number of pages: 7