An Efficient Outlier Detection Approach Over Uncertain Data Stream Based on Frequent Itemset Mining

Cited by: 7
Authors
Hao, Shangbo [1]
Cai, Saihua [1]
Sun, Ruizhi [1]
Li, Sicong [1]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
Source
INFORMATION TECHNOLOGY AND CONTROL | 2019 / Vol. 48 / No. 1
Keywords
outlier detection; frequent itemset mining; uncertain data stream; outlier factors; WINDOW;
DOI
10.5755/j01.itc.48.1.21162
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Outlier detection is essential in data-based science. It aims to detect those itemsets that differ significantly from the rest of the data. Owing to limitations in equipment precision and network transmission, uncertain data are becoming more common in daily life. However, traditional outlier detection methods are not applicable to uncertain data streams, and the large volume of data makes outlier detection costly in terms of memory usage and time. Moreover, the multiple scans of the data stream required by Apriori-like methods are unrealistic. In this paper, a matrix structure is constructed to store the information of an uncertain data stream, and the subsequent mining process is conducted on the matrix structure; therefore, the whole data stream needs to be scanned only once. Then, the "upper cap" concept is used in the FIM-UDS method to mine the frequent itemsets more effectively to support outlier detection. Moreover, two outlier factors and an outlier detection method called FIM-UDSOD are designed to detect potential outliers. Finally, two public datasets are used to verify the efficiency of the FIM-UDS method, and one synthetic dataset is used to evaluate the FIM-UDSOD method. The experimental results show that the proposed FIM-UDSOD method is more effective than other methods in detecting outliers.
Pages: 34-46
Page count: 13
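
The abstract describes storing an uncertain data stream in a matrix after a single scan and then mining frequent itemsets from that matrix. The Python sketch below illustrates that general idea only; it is not the authors' FIM-UDS method. It assumes each transaction maps an item to its existence probability, uses the standard expected-support definition (the sum, over transactions, of the product of the item probabilities), and prunes candidates with the usual downward-closure property rather than the paper's "upper cap", whose exact definition is not given here. The names `build_matrix`, `expected_support`, `frequent_itemsets`, and `min_esup` are hypothetical.

```python
from itertools import combinations
from math import prod


def build_matrix(stream):
    """Read the stream once and return (items, matrix).

    Each matrix row is one transaction; each column holds the existence
    probability of one item (0.0 when the item is absent).
    """
    items, index, rows = [], {}, []
    for transaction in stream:  # the only pass over the stream itself
        for item in transaction:
            if item not in index:
                index[item] = len(items)
                items.append(item)
        rows.append(dict(transaction))
    matrix = [[row.get(item, 0.0) for item in items] for row in rows]
    return items, matrix


def expected_support(matrix, columns):
    """Sum over rows of the product of the probabilities in `columns`."""
    return sum(prod(row[c] for c in columns) for row in matrix)


def frequent_itemsets(stream, min_esup):
    """Level-wise mining on the matrix (assumed threshold: min_esup)."""
    items, matrix = build_matrix(stream)
    current = {}
    for c in range(len(items)):
        support = expected_support(matrix, (c,))
        if support >= min_esup:
            current[(c,)] = support
    result = dict(current)
    k = 1
    while current:
        k += 1
        columns = sorted({c for itemset in current for c in itemset})
        following = {}
        for candidate in combinations(columns, k):
            # Downward closure: every (k-1)-subset must already be frequent.
            if any(s not in current for s in combinations(candidate, k - 1)):
                continue
            support = expected_support(matrix, candidate)
            if support >= min_esup:
                following[candidate] = support
        result.update(following)
        current = following
    return {frozenset(items[c] for c in key): s for key, s in result.items()}
```

Calling `frequent_itemsets` with an iterable of `{item: probability}` dictionaries returns a dictionary that maps each frequent itemset (as a `frozenset`) to its expected support. All candidate evaluation reads the in-memory matrix, so the stream itself is consumed only once.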