An Efficient Outlier Detection Approach Over Uncertain Data Stream Based on Frequent Itemset Mining

Cited by: 7
Authors
Hao, Shangbo [1]
Cai, Saihua [1]
Sun, Ruizhi [1]
Li, Sicong [1]
Affiliation
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
Source
INFORMATION TECHNOLOGY AND CONTROL | 2019 / Vol. 48 / No. 01
Keywords
outlier detection; frequent itemset mining; uncertain data stream; outlier factors; WINDOW
DOI
10.5755/j01.itc.48.1.21162
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Outlier detection is essential in data-based science. It aims to detect itemsets that differ significantly from the rest of the data. Because of limits on equipment precision and network transmission, uncertain data are becoming more common in daily life. However, traditional outlier detection methods are not applicable to uncertain data streams, and the large volume of data makes outlier detection costly in memory usage and time. Moreover, the multiple scans of the data stream required by Apriori-like methods are unrealistic. In this paper, a matrix structure is constructed to store the information of an uncertain data stream, and the subsequent mining is conducted on that matrix; the data stream therefore needs to be scanned only once. The "upper cap" concept is then used in the FIM-UDS method to mine frequent itemsets more effectively in support of outlier detection. In addition, two outlier factors and an outlier detection method called FIM-UDSOD are designed to detect potential outliers. Finally, two public datasets are used to verify the efficiency of the FIM-UDS method, and one synthetic dataset is used to evaluate the FIM-UDSOD method. The experimental results show that the proposed FIM-UDSOD method is more effective than other methods in detecting outliers.
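The matrix-based, single-scan idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's FIM-UDS or FIM-UDSOD method: it assumes the standard expected-support definition for uncertain data (the sum over transactions of the product of the item probabilities), and it does not reproduce the "upper cap" pruning or the two outlier factors, whose definitions are not given here. The sample window, the function names, and the `min_esup` threshold are all hypothetical.

```python
from itertools import combinations

# Hypothetical window of uncertain transactions:
# each transaction maps an item to its existential probability.
WINDOW = [
    {"a": 0.9, "b": 0.8, "c": 0.1},
    {"a": 0.7, "b": 0.6},
    {"a": 0.2, "c": 0.9},
]


def build_matrix(window):
    """Scan the window once and store it as a transactions x items matrix.

    Missing items get probability 0.0, so later mining steps only
    need the matrix and never re-read the stream.
    """
    items = sorted({item for transaction in window for item in transaction})
    matrix = [[t.get(item, 0.0) for item in items] for t in window]
    return items, matrix


def expected_support(items, matrix, itemset):
    """Expected support: sum over rows of the product of item probabilities."""
    columns = [items.index(item) for item in itemset]
    total = 0.0
    for row in matrix:
        probability = 1.0
        for col in columns:
            probability *= row[col]
        total += probability
    return total


def frequent_itemsets(items, matrix, min_esup, max_len=2):
    """Brute-force enumeration of itemsets whose expected support
    reaches min_esup (illustrative only, no pruning)."""
    result = {}
    for size in range(1, max_len + 1):
        for candidate in combinations(items, size):
            support = expected_support(items, matrix, candidate)
            if support >= min_esup:
                result[candidate] = support
    return result


if __name__ == "__main__":
    items, matrix = build_matrix(WINDOW)
    for itemset, support in frequent_itemsets(items, matrix, 1.1).items():
        print(itemset, round(support, 2))
```

In a frequent-itemset-based detector, transactions that contain few frequent itemsets would typically score as more anomalous. The scoring actually used by FIM-UDSOD is not described in the abstract, so it is left out of this sketch.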
Pages: 34 - 46
Number of pages: 13
Related papers
50 in total
  • [41] An Efficient Algorithm for Mining Frequent Closed Itemsets over Data Stream
    Li Guodong
    Xia Kewen
    NEW TRENDS IN MECHATRONICS AND MATERIALS ENGINEERING, 2012, 151 : 570 - 575
  • [42] An Efficient Itemset Mining Approach for Data Streams
    Baralis, Elena
    Cerquitelli, Tania
    Chiusano, Silvia
    Grand, Alberto
    Grimaudo, Luigi
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT II: 15TH INTERNATIONAL CONFERENCE, KES 2011, 2011, 6882 : 515 - 523
  • [43] MrFIM: A MapReduce Approach for Frequent Itemset Mining in Big Data
    Rahman, Abdul
    Manjaramkar, Arati
    2018 4TH INTERNATIONAL CONFERENCE FOR CONVERGENCE IN TECHNOLOGY (I2CT), 2018
  • [44] Frequent Itemset Mining for a Combination of Certain and Uncertain Databases
    Wazir, Samar
    Ahmad, Tanvir
    Beg, M. M. Sufyan
    RECENT DEVELOPMENTS AND THE NEW DIRECTION IN SOFT-COMPUTING FOUNDATIONS AND APPLICATIONS, 2018, 361 : 25 - 39
  • [45] An Efficient Spark-Based Hybrid Frequent Itemset Mining Algorithm for Big Data
    Al-Bana, Mohamed Reda
    Farhan, Marwa Salah
    Othman, Nermin Abdelhakim
    DATA, 2022, 7 (01)
  • [46] A sliding window based algorithm for frequent closed itemset mining over data streams
    Nori, Fatemeh
    Deypir, Mahmood
    Sadreddini, Mohamad Hadi
    JOURNAL OF SYSTEMS AND SOFTWARE, 2013, 86 (03) : 615 - 623
  • [47] An approximate approach to frequent itemset mining
    Zhang, Chunkai
    Zhang, Xudong
    Tian, Panbo
    2017 IEEE SECOND INTERNATIONAL CONFERENCE ON DATA SCIENCE IN CYBERSPACE (DSC), 2017, : 68 - 73
  • [48] Frequent Itemset Mining for Big Data
    Moens, Sandy
    Aksehirli, Emin
    Goethals, Bart
    2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013
  • [49] Frequent Itemset Mining for Big Data
    Chavan, Kiran
    Kulkarni, Priyanka
    Ghodekar, Pooja
    Patil, S. N.
    2015 International Conference on Green Computing and Internet of Things (ICGCIoT), 2015, : 1365 - 1368
  • [50] Frequent Itemset Mining with Elimination of Null Transactions Over Data Streams
    Subbulakshmi, B.
    Nayaki, A. Periya
    Deisy, C.
    ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY ALGORITHMS IN ENGINEERING SYSTEMS, VOL 2, 2015, 325 : 353 - 361