A survey on algorithms for mining frequent itemsets over data streams

Cited by: 0
Authors
James Cheng
Yiping Ke
Wilfred Ng
Affiliations
[1] The Hong Kong University of Science and Technology, Department of Computer Science and Engineering
[2] HKUST
Keywords
Frequent itemsets; Stream mining; Window models; Approximate algorithms
Abstract
The increasing prominence of data streams arising in a wide range of advanced applications such as fraud detection and trend learning has led to the study of online mining of frequent itemsets (FIs). Unlike mining static databases, mining data streams poses many new challenges. In addition to the one-scan nature, the unbounded memory requirement and the high data arrival rate of data streams, the combinatorial explosion of itemsets exacerbates the mining task. The high complexity of the FI mining problem hinders the application of the stream mining techniques. We recognize that a critical review of existing techniques is needed in order to design and develop efficient mining algorithms and data structures that are able to match the processing rate of the mining with the high arrival rate of data streams. Within a unifying set of notations and terminologies, we describe in this paper the efforts and main techniques for mining data streams and present a comprehensive survey of a number of the state-of-the-art algorithms on mining frequent itemsets over data streams. We classify the stream-mining techniques into two categories based on the window model that they adopt in order to provide insights into how and why the techniques are useful. Then, we further analyze the algorithms according to whether they are exact or approximate and, for approximate approaches, whether they are false-positive or false-negative. We also discuss various interesting issues, including the merits and limitations in existing research and substantive areas for future research.
Pages: 1 - 27
Related papers
50 in total
  • [1] A survey on algorithms for mining frequent itemsets over data streams
    Cheng, James
    Ke, Yiping
    Ng, Wilfred
    KNOWLEDGE AND INFORMATION SYSTEMS, 2008, 16 (01) : 1 - 27
  • [2] Mining of Probabilistic Frequent Itemsets over Uncertain Data Streams
    Liu Lixin
    Zhang Xiaolin
    Zhang Huanxiang
    2014 11TH WEB INFORMATION SYSTEM AND APPLICATION CONFERENCE (WISA), 2014 : 231 - 237
  • [3] A Mining Maximal Frequent Itemsets over the Entire History of Data Streams
    Mao, Yinmin
    Li, Hong
    Yang, Lumin
    Chen, Zhigang
    Liu, Lixin
    FIRST INTERNATIONAL WORKSHOP ON DATABASE TECHNOLOGY AND APPLICATIONS, PROCEEDINGS, 2009 : 413 - 417
  • [4] Online mining (recently) maximal frequent itemsets over data streams
    Li, HF
    Lee, SY
    Shan, MK
    15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, Proceedings, 2005 : 11 - 18
  • [5] Mining maximal frequent itemsets in a sliding window over data streams
    Mao Y.
    Li H.
    Yang L.
    Liu L.
    Gaojishu Tongxin/Chinese High Technology Letters, 2010, 20 (11) : 1142 - 1148
  • [6] An Efficient Frequent Closed Itemsets Mining Algorithm Over Data Streams
    Tan, Jun
    Yu, Shao-jun
    2011 SECOND INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATION AND EDUCATION APPLICATION (ICEA 2011), 2011 : 197 - 201
  • [7] Mining recent frequent itemsets in sliding windows over data streams
    Congying Han
    Lijun Xu
    Guoping He
    COMPUTING AND INFORMATICS, 2008, 27 (03) : 315 - 339
  • [8] An Efficient Frequent Closed Itemsets Mining Algorithm Over Data Streams
    Tan, Jun
    Bu, Yingyong
    Yang, Bo
    2009 INTERNATIONAL CONFERENCE ON INFORMATION MANAGEMENT, INNOVATION MANAGEMENT AND INDUSTRIAL ENGINEERING, VOL 3, PROCEEDINGS, 2009 : 65 - +
  • [9] An efficient algorithm for mining maximal frequent itemsets over data streams
    Mao Y.
    Yang L.
    Li H.
    Chen Z.
    Liu L.
    Gaojishu Tongxin/Chinese High Technology Letters, 2010, 20 (03) : 246 - 252
  • [10] Approximate mining of global closed frequent itemsets over data streams
    Guo, Lichao
    Su, Hongye
    Qu, Yu
    JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2011, 348 (06) : 1052 - 1081