A clustering algorithm for data stream based on grid-tree and similarity

Authors
Huang G. [1]
Guo W. [1]
Ren J. [1]
Chen L. [1]
Affiliation
[1] College of Information Science and Engineering
Keywords
Clustering; Data stream; Grid; Similarity
DOI
10.4156/ijact.vol3.issue9.3
Abstract
Algorithms based on k-means cannot find clusters of arbitrary shape, and they require the number of clusters to be specified in advance. Moreover, most grid-based clustering algorithms cannot handle boundary points accurately. To address these issues, a novel approach based on a density grid-tree and similarity, DGTSstream, is proposed. In DGTSstream, each new data record is mapped into the grid-tree, and sporadic grids are removed by means of an update cycle and a noise density threshold. The average density is used to derive the density threshold. The algorithm repeatedly seeks the maximum-density grid that has no cluster flag and uses it as the starting point for finding a cluster with a depth-first strategy. Finally, similarity is used to assign the boundary points. Experimental results show that the algorithm can find clusters of arbitrary shape and has better clustering accuracy and efficiency.
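The steps named in the abstract can be illustrated with a minimal batch sketch. This is not the authors' DGTSstream implementation: the abstract gives no formulas, so the sketch makes several assumptions. It uses a flat dictionary of cells instead of a grid-tree, processes all points at once instead of running an update cycle over a stream, treats cells as adjacent when they share a face, and uses Euclidean distance to a neighbouring dense cell's centroid as a stand-in for the paper's similarity measure. All function and parameter names (`cluster`, `cell_size`, `noise_density`) are hypothetical.

```python
from collections import defaultdict
from math import dist, floor


def cell_of(point, cell_size):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(floor(x / cell_size) for x in point)


def adjacent(cell):
    """Yield the face-adjacent cells (assumed neighbourhood definition)."""
    for d in range(len(cell)):
        for step in (-1, 1):
            yield cell[:d] + (cell[d] + step,) + cell[d + 1:]


def centroid(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def cluster(points, cell_size=1.0, noise_density=1):
    """Return {point: cluster id}, with -1 for noise or unassigned points."""
    cells = defaultdict(list)
    for p in points:
        cells[cell_of(p, cell_size)].append(tuple(p))
    labels = {p: -1 for pts in cells.values() for p in pts}

    # Remove sporadic cells: those at or below the noise density threshold.
    kept = {c: pts for c, pts in cells.items() if len(pts) > noise_density}
    if not kept:
        return labels

    # Density threshold derived from the average density of surviving cells.
    avg = sum(len(pts) for pts in kept.values()) / len(kept)
    dense = {c for c, pts in kept.items() if len(pts) >= avg}

    # Repeatedly seed from the densest unflagged cell and expand depth-first.
    cell_label = {}
    next_id = 0
    while len(cell_label) < len(dense):
        seed = max((c for c in dense if c not in cell_label),
                   key=lambda c: len(kept[c]))
        cell_label[seed] = next_id
        stack = [seed]
        while stack:
            for nb in adjacent(stack.pop()):
                if nb in dense and nb not in cell_label:
                    cell_label[nb] = next_id
                    stack.append(nb)
        next_id += 1

    # Boundary points: attach each point of a sparse cell to the adjacent
    # dense cell whose centroid is nearest (assumed similarity stand-in).
    for c, pts in kept.items():
        if c in dense:
            for p in pts:
                labels[p] = cell_label[c]
            continue
        near = [nb for nb in adjacent(c) if nb in dense]
        for p in pts:
            if near:
                best = min(near, key=lambda nb: dist(p, centroid(kept[nb])))
                labels[p] = cell_label[best]
    return labels
```

Calling `cluster(points)` on a list of coordinate tuples returns a label per point. Duplicate points share one dictionary entry, which is acceptable for a sketch but would need handling in a real stream setting.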
Pages: 17-24
Page count: 7