A GSP-based efficient algorithm for mining frequent sequences

Authors
Zhang, MH [1 ]
Kao, B [1 ]
Yip, CL [1 ]
Cheung, D [1 ]
Affiliation
[1] Univ Hong Kong, Dept Comp Sci & Informat Syst, Hong Kong, Hong Kong, Peoples R China
Source
IC-AI'2001: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS I-III | 2001
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies the problem of mining frequent sequences in transactional databases. In [6], Agrawal and Srikant proposed the GSP algorithm for extracting frequently occurring sequences. GSP is an iterative algorithm. It scans the database a number of times depending on the length of the longest frequent sequences in the database. The I/O cost is thus substantial if the database contains very long frequent sequences. In this paper, we extend the candidate generating function used by GSP and propose a new two-stage algorithm MFS. Our algorithm first mines a sample of the database to obtain a rough estimate of the frequent sequences and then refines the solution. Experimental results show that MFS saves I/O cost significantly compared with GSP.
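The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes every sequence element is a single item (GSP also handles itemsets and time constraints), that the item alphabet is known in advance, and that the sample fits in memory. The names `mine`, `two_stage`, `join`, `count_pass`, and all parameters are hypothetical. With an empty seed, `mine` behaves level-wise, needing one database scan per candidate length; with a seed taken from a sample, candidates of several lengths are counted in the same scan.

```python
import math
import random


def is_subsequence(pattern, seq):
    """True if the items of pattern occur in seq in order (gaps allowed)."""
    it = iter(seq)
    return all(item in it for item in pattern)


def count_pass(db, candidates):
    """One scan of db returning the support count of every candidate."""
    counts = dict.fromkeys(candidates, 0)
    for seq in db:
        for cand in counts:
            if is_subsequence(cand, seq):
                counts[cand] += 1
    return counts


def join(frequent):
    """Join equal-length s1, s2 when s1 without its first item equals
    s2 without its last item. Applied to all lengths at once."""
    return {s1 + s2[-1:]
            for s1 in frequent for s2 in frequent
            if len(s1) == len(s2) and s1[1:] == s2[:-1]}


def mine(db, min_count, alphabet, seed=()):
    """Count seed candidates and single items first, then keep generating
    and counting new candidates. Returns (frequent counts, scans of db)."""
    counted, passes, frequent = {}, 0, set()
    candidates = set(seed) | {(a,) for a in alphabet}
    while candidates:
        counted.update(count_pass(db, candidates))
        passes += 1
        frequent = {s for s, c in counted.items() if c >= min_count}
        # Only candidates that have never been counted need another scan.
        candidates = join(frequent).difference(counted)
    return {s: counted[s] for s in frequent}, passes


def two_stage(db, min_frac, alphabet, sample_size, rng):
    """Stage 1: mine a sample for a rough estimate of the frequent sequences.
    Stage 2: verify and refine that estimate against the full database."""
    sample = rng.sample(db, sample_size)
    sample_min = max(1, math.ceil(min_frac * sample_size))
    estimate, _ = mine(sample, sample_min, alphabet)
    return mine(db, math.ceil(min_frac * len(db)), alphabet, seed=estimate)
```

Calling `mine(db, min_count, alphabet)` and `two_stage(db, min_frac, alphabet, sample_size, random.Random(0))` on the same list of tuples returns the same frequent sequences, because every count comes from the full database. Only the number of scans differs, and in this sketch the seeded run never needs more scans than the level-wise one.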
Pages: 497-503
Number of pages: 7
Related papers
50 in total
  • [31] Efficient Algorithms for Mining and Incremental Update of Maximal Frequent Sequences
    Ben Kao
    Minghua Zhang
    Chi-Lap Yip
    David W. Cheung
    Usama Fayyad
    Data Mining and Knowledge Discovery, 2005, 10 : 87 - 116
  • [32] Efficient algorithms for mining frequent high utility sequences with constraints
    Tin Truong
    Hai Duong
    Bac Le
    Fournier-Viger, Philippe
    Yun, Unil
    Fujita, Hamido
    INFORMATION SCIENCES, 2021, 568 : 239 - 264
  • [33] An online frequency rate based algorithm for mining frequent sequences in evolving data streams
    Barouni-Ebrahimi, M.
    Ghorbani, Ali A.
    CHALLENGES IN INFORMATION TECHNOLOGY MANAGEMENT, 2008, : 56 - 62
  • [34] A scalable algorithm for mining maximal frequent sequences using a sample
    Luo, Congnan
    Chung, Soon M.
    KNOWLEDGE AND INFORMATION SYSTEMS, 2008, 15 (02) : 149 - 179
  • [35] A scalable algorithm for mining maximal frequent sequences using a sample
    Congnan Luo
    Soon M. Chung
    Knowledge and Information Systems, 2008, 15 : 149 - 179
  • [36] A scalable algorithm for mining maximal frequent sequences using sampling
    Luo, C
    Chung, SM
    ICTAI 2004: 16TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, : 156 - 165
  • [37] AN EFFICIENT ALGORITHM BASED ON TIME DECAY MODEL FOR MINING MAXIMAL FREQUENT ITEMSETS
    Huang, Guo-Yan
    Wang, Li-Bo
    Hu, Chang-Zhen
    Ren, Jia-Dong
    He, Hui-Ling
    PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 2063 - +
  • [38] An Efficient Parallel Algorithm for Mining Both Frequent Closed and Generator Sequences on Multi-core Processors
    Hai Duong
    Tin Truong
    Bac Le
    PROCEEDINGS OF 2018 5TH NAFOSTED CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS 2018), 2018, : 154 - 159
  • [39] GenMax: An efficient algorithm for mining maximal frequent itemsets
    Gouda, K
    Zaki, MJ
    DATA MINING AND KNOWLEDGE DISCOVERY, 2005, 11 (03) : 223 - 242
  • [40] An Efficient Algorithm for Mining Frequent Itemsets with Single Constraint
    Hai Duong
    Tin Truong
    Bac Le
    ADVANCED COMPUTATIONAL METHODS FOR KNOWLEDGE ENGINEERING, 2013, 479 : 367 - 378