Data stream clustering: a review

Cited by: 0
Authors
Alaettin Zubaroğlu
Volkan Atalay
Affiliation
[1] Middle East Technical University, Department of Computer Engineering
Keywords
Data streams; Data stream clustering; Real-time clustering
DOI
Not available
Abstract
The number of connected devices is steadily increasing, and these devices continuously generate data streams. Real-time processing of data streams is attracting interest despite many challenges. Clustering is one of the most suitable methods for real-time data stream processing because it can be applied with less prior information about the data and does not need labeled instances. However, data stream clustering differs from traditional clustering in many respects and poses several challenges. Here, we describe the concepts and common characteristics of data streams, such as concept drift, data structures for data streams, time window models and outlier detection. We comprehensively review recent data stream clustering algorithms and analyze them in terms of base clustering technique, computational complexity and clustering accuracy, and we compare these algorithms. We point to popular data stream repositories and datasets, as well as stream processing tools and platforms. Open problems in data stream clustering are also discussed.
Pages: 1201-1236
Page count: 35