A Novel Rough Set Based Clustering Approach for Streaming Data

Authors
Yogita [1]
Toshniwal, Durga [1]
Affiliations
[1] Indian Inst Technol, Roorkee, Uttar Pradesh, India
Keywords
Clustering; Streaming data; Cluster approximation; Rough set
DOI
10.1007/978-81-322-1602-5_131
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an important data mining task. Clustering streaming data is challenging because the data cannot be scanned multiple times and new concepts may keep evolving over time. The inherent uncertainty in real-world data streams further magnifies the challenge. Rough set theory is a soft computing technique that can be used to deal with the uncertainty involved in cluster analysis. In this paper, we propose a novel rough set based clustering method for streaming data. It describes a cluster as a pair consisting of a lower approximation and an upper approximation. The lower approximation comprises the data objects that can be assigned with certainty to the respective cluster, whereas the upper approximation contains, along with the elements of the lower approximation, those data objects whose membership in the various clusters is not crisp. Uncertainty in assigning a data object to a cluster is captured by allowing the upper approximations to overlap. The proposed method generates soft clusters. In view of the challenges of streaming data, the proposed method is incremental and adaptive to evolving concepts. Experimental results on synthetic and real-world data sets show that the proposed approach outperforms the Leader clustering algorithm in terms of classification accuracy. The proposed method generates more natural clusters than k-means clustering and is robust to outliers. The performance of the proposed method is also analyzed in terms of the correctness and accuracy of rough clustering.
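The lower/upper approximation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify: it is a single-pass, leader-style assignment rule written only to show how certain members go to a lower approximation while ambiguous points are shared among overlapping upper approximations. The names `RoughCluster` and `assign`, the distance threshold `radius`, and the ambiguity factor `ratio` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RoughCluster:
    leader: tuple                               # representative point (hypothetical choice)
    lower: list = field(default_factory=list)   # points assigned with certainty
    upper: list = field(default_factory=list)   # lower points plus uncertain points


def assign(point, clusters, radius=1.0, ratio=1.5):
    """Process one stream point in a single pass (illustrative rule only)."""
    dists = [math.dist(point, c.leader) for c in clusters]
    near = [i for i, d in enumerate(dists) if d <= radius]
    if not near:
        # No cluster is close enough: start a new one, so new concepts can appear.
        clusters.append(RoughCluster(leader=point, lower=[point], upper=[point]))
        return
    best = min(near, key=lambda i: dists[i])
    # Clusters nearly as close as the best one make the assignment uncertain.
    rivals = [i for i in near if i != best and dists[i] <= ratio * dists[best]]
    if rivals:
        # Uncertain point: it joins several upper approximations and no lower one.
        for i in [best] + rivals:
            clusters[i].upper.append(point)
    else:
        # Certain point: it joins both approximations of a single cluster.
        clusters[best].lower.append(point)
        clusters[best].upper.append(point)


clusters = []
for p in [(0.0, 0.0), (3.0, 0.0), (0.2, 0.0), (1.5, 0.0)]:
    assign(p, clusters, radius=2.0)
```

In this toy stream, the point `(1.5, 0.0)` is equidistant from both leaders, so it appears in both upper approximations and in neither lower approximation, while `(0.2, 0.0)` is assigned with certainty to the first cluster.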
Pages: 1253-1265
Number of pages: 13