An outlier detection approach in large-scale data stream using rough set

Cited by: 8
Authors
Singh, Manmohan [1]
Pamula, Rajendra [1]
Affiliations
[1] Indian Sch Mines, Indian Inst Technol, Dept Comp Sci & Engn, Dhanbad 826004, Jharkhand, India
Source
NEURAL COMPUTING & APPLICATIONS | 2020 / Vol. 32 / Issue 13
Keywords
Relative information entropy; Outlier detection; Rough sets; Data mining; Indiscernible sets; INFORMATION-ENTROPY; UNCERTAINTY; REDUCTION;
DOI
10.1007/s00521-019-04421-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Outlier detection has become an important research area in stream data mining because of its wide range of applications. Many methods have been proposed in the literature, but they work well only for simple, positive regions of outliers and give little attention to boundary regions. Moreover, an algorithm that processes stream data must be effective and able to process potentially infinite data in one pass or a limited number of passes. These problems motivated us to propose an outlier detection approach for large-scale data streams. The proposed algorithm employs relative cardinality, the entropy outlier factor from the theory of information-based systems, and a size-variant sliding window over the stream data. In addition, we propose a new methodology for concept drift adaptation on evolving data streams. The proposed method is evaluated on nine benchmark datasets and compared with six existing methods: EXPoSE, iForest, OC-SVM, LOF, KDE, and FastAbod. Experimental results show that the proposed method outperforms the six existing methods in terms of receiver operating characteristic curve, precision-recall, and computational time for positive regions as well as for boundary regions.
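The abstract names three ingredients: relative cardinality, an entropy-based outlier factor, and a sliding window. The sketch below is not the authors' algorithm. It is a minimal illustration under stated assumptions: rows are tuples of categorical attribute values, two rows are indiscernible when their tuples are equal, and the window has a fixed size rather than the size-variant window the paper describes. The function names, the rarity score `1 - |[x]| / |window|`, and the threshold are all hypothetical choices made for illustration.

```python
from collections import Counter, deque
from math import log2


def partition_entropy(rows):
    """Shannon entropy of the partition of `rows` into indiscernibility classes.

    Assumption: two rows are indiscernible when their attribute tuples are equal.
    """
    n = len(rows)
    counts = Counter(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def rarity_scores(rows):
    """Score each row by 1 - |[x]| / |window| (hypothetical relative-cardinality score).

    Rows that fall in small indiscernibility classes score close to 1.
    """
    n = len(rows)
    counts = Counter(rows)
    return [1 - counts[r] / n for r in rows]


def stream_outliers(stream, window_size=4, threshold=0.7):
    """Flag the newest row whenever its score within a full window exceeds `threshold`.

    Assumption: fixed-size window; the paper uses a size-variant window.
    """
    window = deque(maxlen=window_size)
    flagged = []
    for row in stream:
        window.append(tuple(row))
        if len(window) == window_size:
            newest_score = rarity_scores(list(window))[-1]
            if newest_score > threshold:
                flagged.append(tuple(row))
    return flagged


if __name__ == "__main__":
    demo = [("a", "x"), ("a", "x"), ("a", "x"), ("b", "y"), ("a", "x"), ("a", "x")]
    print(stream_outliers(demo))
```

In this toy stream the single `("b", "y")` row is the only one whose indiscernibility class covers a quarter of the window, so it is the only row flagged. A fuller treatment would combine such a score with the change in `partition_entropy` when a row is removed, and would adapt the window size to concept drift.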
Pages: 9113-9127
Page count: 15
Related papers
  • [1] An outlier detection approach in large-scale data stream using rough set
    Manmohan Singh
    Rajendra Pamula
    Neural Computing and Applications, 2020, 32 : 9113 - 9127
  • [2] A rough set approach to outlier detection
    Jiang, Feng
    Sui, Yuefei
    Cao, Cungen
    INTERNATIONAL JOURNAL OF GENERAL SYSTEMS, 2008, 37 (05) : 519 - 536
  • [3] Outlier Detection in Large-Scale Sensor Network Data Using Shrinkage Estimators
    Wu, Ming-Chun
    Chen, Kwang-Cheng
    2015 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2015,
  • [4] Outlier detection using rough set theory
    Jiang, F
    Sui, YF
    Cao, CG
    ROUGH SETS, FUZZY SETS, DATA MINING, AND GRANULAR COMPUTING, PT 2, PROCEEDINGS, 2005, 3642 : 79 - 87
  • [5] Outlier Detection Forest for Large-Scale Categorical Data Sets
    Sun, Zhipeng
    Du, Hongwei
    Ye, Qiang
    Liu, Chuang
    Kibenge, Patricia Lilian
    Huang, Hui
    Li, Yuying
    COMPUTATIONAL DATA AND SOCIAL NETWORKS, 2019, 11917 : 45 - 56
  • [6] Anomaly detection in large-scale data stream networks
    Duc-Son Pham
    Venkatesh, Svetha
    Lazarescu, Mihai
    Budhaditya, Saha
    DATA MINING AND KNOWLEDGE DISCOVERY, 2014, 28 (01) : 145 - 189
  • [7] Anomaly detection in large-scale data stream networks
    Duc-Son Pham
    Svetha Venkatesh
    Mihai Lazarescu
    Saha Budhaditya
    Data Mining and Knowledge Discovery, 2014, 28 : 145 - 189
  • [8] An Algorithm for Outlier Detection Using Rough Set Theory
    Lou, Mingzhu
    2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING APPLICATIONS (CSEA 2015), 2015, : 99 - 103
  • [9] Information-Theoretic Outlier Detection for Large-Scale Categorical Data
    Wu, Shu
    Wang, Shengrui
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (03) : 589 - 602
  • [10] Outlier Detection and Elimination in Stream Data - An Experimental Approach
    Kalisch, Mateusz
    Michalak, Marcin
    Przystalka, Piotr
    Sikora, Marek
    Wrobel, Lukasz
    ROUGH SETS, (IJCRS 2016), 2016, 9920 : 416 - 426