An Efficient Outlier Detection Approach Over Uncertain Data Stream Based on Frequent Itemset Mining

Cited by: 7
Authors
Hao, Shangbo [1]
Cai, Saihua [1]
Sun, Ruizhi [1]
Li, Sicong [1]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
Source
INFORMATION TECHNOLOGY AND CONTROL | 2019 / Vol. 48 / No. 01
Keywords
outlier detection; frequent itemset mining; uncertain data stream; outlier factors; WINDOW;
DOI
10.5755/j01.itc.48.1.21162
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Outlier detection is essential in data-based science. It aims to detect those itemsets that differ significantly from the rest of the data. Because of limits on equipment precision and network transmission, uncertain data are becoming more common in daily life. However, traditional outlier detection methods are not applicable to uncertain data streams, and the large volume of data makes outlier detection costly in terms of memory usage and time. Moreover, the multiple scans of the data stream required by Apriori-like methods are unrealistic. In this paper, a matrix structure is constructed to store the information of an uncertain data stream, and the subsequent mining process is conducted on the matrix structure; therefore, the whole data stream needs to be scanned only once. Then, the "upper cap" concept is used in the FIM-UDS method to mine the frequent itemsets more effectively to support outlier detection. Moreover, two outlier factors and an outlier detection method called FIM-UDSOD are designed to detect potential outliers. Finally, two public datasets are used to verify the efficiency of the FIM-UDS method, and one synthetic dataset is used to evaluate the FIM-UDSOD method. The experimental results show that the proposed FIM-UDSOD method is more effective than other methods in detecting outliers.
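The single-scan matrix idea described in the abstract can be illustrated with a small sketch. This is not the paper's FIM-UDS or FIM-UDSOD implementation: the abstract does not define the "upper cap" or the two outlier factors, so the code below assumes a simple bound (expected support of a prefix times the largest probability of the added item) and an FPOF-style score as stand-ins. All function names, the sample stream, and the threshold `min_esup` are hypothetical.

```python
def build_matrix(stream):
    """Scan the stream once; return (items, matrix) of existential probabilities."""
    items, index, rows = [], {}, []
    for txn in stream:                      # the only pass over the stream
        for item in txn:
            if item not in index:
                index[item] = len(items)
                items.append(item)
        rows.append(dict(txn))
    # Rows are transactions, columns are items; absent items get probability 0.
    matrix = [[row.get(item, 0.0) for item in items] for row in rows]
    return items, matrix


def expected_support(matrix, cols):
    """Sum over transactions of the product of the chosen items' probabilities."""
    total = 0.0
    for row in matrix:
        p = 1.0
        for c in cols:
            p *= row[c]
        total += p
    return total


def mine_frequent(matrix, min_esup):
    """Level-wise mining on the matrix with an assumed upper-bound pruning step."""
    n_items = len(matrix[0]) if matrix else 0
    max_prob = [max(row[c] for row in matrix) for c in range(n_items)]
    frequent, level = {}, []
    for c in range(n_items):
        s = expected_support(matrix, (c,))
        if s >= min_esup:
            frequent[(c,)] = s
            level.append((c,))
    while level:
        next_level = []
        for cols in level:
            for c in range(cols[-1] + 1, n_items):
                if (c,) not in frequent:
                    continue
                # Assumed cap: esup(X + c) <= esup(X) * max probability of c.
                if frequent[cols] * max_prob[c] < min_esup:
                    continue
                candidate = cols + (c,)
                s = expected_support(matrix, candidate)
                if s >= min_esup:
                    frequent[candidate] = s
                    next_level.append(candidate)
        level = next_level
    return frequent


def outlier_scores(matrix, frequent):
    """FPOF-style stand-in score; a lower value suggests a more outlying row."""
    total = sum(frequent.values())
    scores = []
    for row in matrix:
        covered = 0.0
        for cols, s in frequent.items():
            p = 1.0
            for c in cols:
                p *= row[c]
            covered += p * s
        scores.append(covered / total if total else 0.0)
    return scores


# Hypothetical uncertain stream: each transaction maps item -> probability.
stream = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.8, "b": 0.9},
    {"a": 0.9, "b": 0.7, "c": 0.1},
    {"c": 0.6, "d": 0.5},
]
items, matrix = build_matrix(stream)
frequent = mine_frequent(matrix, min_esup=1.5)
scores = outlier_scores(matrix, frequent)
```

In this toy stream the last transaction contains none of the frequent items, so it receives the lowest score. A windowed variant would rebuild or update the matrix per window; that step is omitted here.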
Pages: 34 - 46
Number of pages: 13
Related papers
50 in total
  • [31] An efficient closed frequent itemset miner for the MOA stream mining system
    Quadrana, Massimo
    Bifet, Albert
    Gavalda, Ricard
    AI COMMUNICATIONS, 2015, 28 (01) : 143 - 158
  • [32] A False Negative Maximal Frequent Itemset Mining Algorithm over Stream
    Li, Haifeng
    Zhang, Ning
    ADVANCED DATA MINING AND APPLICATIONS, PT I, 2011, 7120 : 29 - +
  • [33] Probabilistic Frequent Itemset Mining in Uncertain Databases
    Bernecker, Thomas
    Kriegel, Hans-Peter
    Renz, Matthias
    Verhein, Florian
    Zuefle, Andreas
    KDD-09: 15TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2009, : 119 - 127
  • [34] BISC: A Bitmap Itemset Support Counting Approach for Efficient Frequent Itemset Mining
    Chen, Jinlin
    Xiao, Keli
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2010, 4 (03)
  • [35] An Efficient Frequent Itemset Mining Method over High-speed Data Streams
    Memar, Mina
    Deypir, Mahmood
    Sadreddini, Mohammad Hadi
    Fakhrahmad, Seyyed Mostafa
    COMPUTER JOURNAL, 2012, 55 (11) : 1357 - 1366
  • [36] UWFP-Outlier: an efficient frequent-pattern-based outlier detection method for uncertain weighted data streams
    Saihua Cai
    Li Li
    Qian Li
    Sicong Li
    Shangbo Hao
    Ruizhi Sun
    Applied Intelligence, 2020, 50 : 3452 - 3470
  • [37] UWFP-Outlier: an efficient frequent-pattern-based outlier detection method for uncertain weighted data streams
    Cai, Saihua
    Li, Li
    Li, Qian
    Li, Sicong
    Hao, Shangbo
    Sun, Ruizhi
    APPLIED INTELLIGENCE, 2020, 50 (10) : 3452 - 3470
  • [38] Efficient algorithm for frequent pattern mining over uncertain data streams
    Du, Congqiang
    Shao, Zengzhen
    Journal of Computational Information Systems, 2015, 11 (21) : 7799 - 7808
  • [39] Efficient Frequent Itemset Mining from Dense Data Streams
    Cuzzocrea, Alfredo
    Jiang, Fan
    Lee, Wookey
    Leung, Carson K.
    WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014, 2014, 8709 : 593 - 601
  • [40] An efficient frequent itemset mining algorithm
    Luo, Ke
    Zhang, Xue-Mao
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 756 - 761