A survey on algorithms for mining frequent itemsets over data streams

Cited by: 0
Authors
James Cheng
Yiping Ke
Wilfred Ng
Affiliations
[1] The Hong Kong University of Science and Technology, Department of Computer Science and Engineering
[2] HKUST
Source
Keywords
Frequent itemsets; Stream mining; Window models; Approximate algorithms
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
The increasing prominence of data streams arising in a wide range of advanced applications such as fraud detection and trend learning has led to the study of online mining of frequent itemsets (FIs). Unlike mining static databases, mining data streams poses many new challenges. In addition to the one-scan nature, the unbounded memory requirement and the high data arrival rate of data streams, the combinatorial explosion of itemsets exacerbates the mining task. The high complexity of the FI mining problem hinders the application of the stream mining techniques. We recognize that a critical review of existing techniques is needed in order to design and develop efficient mining algorithms and data structures that are able to match the processing rate of the mining with the high arrival rate of data streams. Within a unifying set of notations and terminologies, we describe in this paper the efforts and main techniques for mining data streams and present a comprehensive survey of a number of the state-of-the-art algorithms on mining frequent itemsets over data streams. We classify the stream-mining techniques into two categories based on the window model that they adopt in order to provide insights into how and why the techniques are useful. Then, we further analyze the algorithms according to whether they are exact or approximate and, for approximate approaches, whether they are false-positive or false-negative. We also discuss various interesting issues, including the merits and limitations in existing research and substantive areas for future research.
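The abstract's distinction between exact and false-positive approximate mining can be illustrated with a minimal sketch. It is not taken from the paper: it follows the general idea of Lossy Counting (Manku and Motwani), simplified to enumerate small itemsets per transaction over a landmark window. The function name `lossy_count_itemsets` and the parameters `min_support`, `epsilon`, and `max_size` are assumptions made for illustration.

```python
from itertools import combinations
from math import ceil


def lossy_count_itemsets(transactions, min_support, epsilon, max_size=2):
    """Approximate frequent itemsets in one pass (false-positive style).

    Hypothetical helper: each stored count undercounts the true count by at
    most epsilon * n, so no truly frequent itemset is missed, but some
    itemsets slightly below min_support may be reported.
    """
    bucket_width = ceil(1 / epsilon)
    counts = {}  # itemset (sorted tuple) -> [observed count, max undercount]
    n = 0  # number of transactions seen so far
    for transaction in transactions:
        n += 1
        bucket_id = ceil(n / bucket_width)
        items = sorted(set(transaction))
        # Enumerate itemsets up to max_size to limit combinatorial explosion.
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                if itemset in counts:
                    counts[itemset][0] += 1
                else:
                    # A new entry may have been missed in earlier buckets.
                    counts[itemset] = [1, bucket_id - 1]
        # At each bucket boundary, drop entries that cannot be frequent.
        if n % bucket_width == 0:
            stale = [k for k, (f, d) in counts.items() if f + d <= bucket_id]
            for itemset in stale:
                del counts[itemset]
    threshold = (min_support - epsilon) * n
    return {k: f for k, (f, d) in counts.items() if f >= threshold}


if __name__ == "__main__":
    stream = [["a", "b"]] * 9 + [["z"]] + [["a", "b"]] * 10
    print(lossy_count_itemsets(stream, min_support=0.5, epsilon=0.1))
```

Lowering `epsilon` tightens the error bound at the cost of keeping more entries in memory, which is the trade-off approximate stream-mining algorithms generally have to balance.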
Pages: 1 / 27
Number of pages: 26
Related papers
50 in total
  • [21] An efficient approach to mining frequent itemsets on data streams
    Ansari, Sara
    Sadreddini, Mohammad Hadi
    World Academy of Science, Engineering and Technology, 2009, 37 : 489 - 495
  • [22] Mining of Frequent Itemsets from Streams of Uncertain Data
    Leung, Carson Kai-Sang
    Hao, Boyu
    ICDE: 2009 IEEE 25TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2009, : 1663 - 1670
  • [23] Efficient mining of frequent itemsets from data streams
    Leung, Carson Kai-Sang
    Brajczuk, Dale A.
    SHARING DATA, INFORMATION AND KNOWLEDGE, PROCEEDINGS, 2008, 5071 : 2 - 14
  • [24] Mining frequent itemsets over distributed data streams by continuously maintaining a global synopsis
    Wang, En Tzu
    Chen, Arbee L. P.
    Data Mining and Knowledge Discovery, 2011, 23 : 252 - 299
  • [25] Variable support mining of frequent itemsets over data streams using synopsis vectors
    Lin, Ming-Yen
    Hsueh, Sue-Chen
    Hwang, Sheng-Kun
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2006, 3918 : 724 - 728
  • [26] Mining frequent closed itemsets from a landmark window over online data streams
    Liu, Xuejun
    Guan, Jihong
    Hu, Ping
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2009, 57 (06) : 927 - 936
  • [27] Mining frequent itemsets over data streams using efficient window sliding techniques
    Li, Hua-Fu
    Lee, Suh-Yin
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (02) : 1466 - 1477
  • [28] Mining frequent itemsets over distributed data streams by continuously maintaining a global synopsis
    Wang, En Tzu
    Chen, Arbee L. P.
    DATA MINING AND KNOWLEDGE DISCOVERY, 2011, 23 (02) : 252 - 299
  • [29] A frequent itemsets mining algorithm based on matrix in sliding window over data streams
    Fan Guidan
    Yin Shaohong
    2013 THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM DESIGN AND ENGINEERING APPLICATIONS (ISDEA), 2013, : 66 - 69
  • [30] Efficient maintenance and mining of frequent itemsets over Online data streams with a sliding window
    Li, Hua-Fu
    Ho, Chin-Chuan
    Shan, Man-Kwan
    Lee, Suh-Yin
    2006 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-6, PROCEEDINGS, 2006, : 2672 - +