A clustering algorithm for data stream based on grid-tree and similarity

Citations: 0
Authors
Huang G. [1]
Guo W. [1]
Ren J. [1]
Chen L. [1]
Affiliation
[1] College of Information Science and Engineering
Keywords
Clustering; Data stream; Grid; Similarity
DOI
10.4156/ijact.vol3.issue9.3
Abstract
Algorithms based on k-means cannot find clusters of arbitrary shape, and they require the number of clusters to be specified in advance. Moreover, most grid-based clustering algorithms cannot handle boundary points accurately. To address these issues, a novel approach based on a density grid-tree and similarity, DGTSstream, is proposed. In DGTSstream, each new data record is mapped into the grid-tree, and sporadic grids are removed by means of an update cycle and a noise density threshold. The density threshold is derived from the average density. The algorithm repeatedly seeks the maximum-density grid that has no cluster flag and uses it as the starting point for finding a cluster with a depth-first strategy. Finally, similarity is used to assign the boundary points. Experimental results show that the algorithm can find clusters of arbitrary shape and achieves better clustering accuracy and efficiency.
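The steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' DGTSstream implementation. It assumes a flat dictionary of cells in place of the grid-tree, processes a fixed batch rather than a stream with an update cycle, treats a cell as dense when its point count is at least the average count, and defines similarity as the inverse of the Euclidean distance to a neighboring dense cell's centroid. The names `grid_cluster`, `cell_size`, and `noise_density` are hypothetical.

```python
from collections import defaultdict
from itertools import product
import math


def grid_cluster(points, cell_size=1.0, noise_density=1):
    """Return one label per point; -1 marks points treated as noise."""

    def cell_of(p):
        return tuple(math.floor(x / cell_size) for x in p)

    # Map each point to its grid cell (a flat dict stands in for the grid-tree).
    cells = defaultdict(list)
    for p in points:
        cells[cell_of(p)].append(p)

    # Remove sporadic cells whose count does not exceed the noise threshold.
    cells = {k: v for k, v in cells.items() if len(v) > noise_density}
    if not cells:
        return [-1] * len(points)

    # Assumption: the density threshold is the average count of the kept cells.
    avg = sum(len(v) for v in cells.values()) / len(cells)
    dense = {k for k, v in cells.items() if len(v) >= avg}

    dim = len(next(iter(cells)))
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    def neighbors(key):
        return [tuple(a + b for a, b in zip(key, o)) for o in offsets]

    # Visit dense cells from highest to lowest density; each unlabeled one
    # seeds a depth-first expansion over adjacent dense cells.
    labels = {}
    next_label = 0
    for seed in sorted(dense, key=lambda k: -len(cells[k])):
        if seed in labels:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:
            cur = stack.pop()
            for nb in neighbors(cur):
                if nb in dense and nb not in labels:
                    labels[nb] = next_label
                    stack.append(nb)
        next_label += 1

    centroids = {
        k: tuple(sum(c) / len(cells[k]) for c in zip(*cells[k])) for k in dense
    }

    # Boundary points: points in kept but non-dense cells join the cluster of
    # the most similar adjacent dense cell (assumed inverse-distance similarity).
    result = []
    for p in points:
        key = cell_of(p)
        if key in labels:
            result.append(labels[key])
            continue
        candidates = [nb for nb in neighbors(key) if nb in dense]
        if key not in cells or not candidates:
            result.append(-1)
            continue
        best = max(
            candidates,
            key=lambda nb: 1.0 / (1.0 + math.dist(p, centroids[nb])),
        )
        result.append(labels[best])
    return result
```

Calling `grid_cluster(points, cell_size=1.0, noise_density=1)` on a list of coordinate tuples returns a list of integer labels in the same order. Because clusters grow cell by cell through adjacency, the sketch needs no preset cluster count and can follow non-convex shapes, which is the property the abstract contrasts with k-means.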
Pages: 17-24
Page count: 7