Incremental update on sequential patterns in large databases by implicit merging and efficient counting

Cited by: 29
Authors
Lin, MY [1]
Lee, SY [1]
Affiliation
[1] Natl Chiao Tung Univ, Dept Comp Sci & Informat Engn, Hsinchu 30050, Taiwan
Keywords
data mining; sequential patterns; incremental update; sequence discovery; sequence merging
DOI
10.1016/S0306-4379(03)00036-X
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Current approaches to sequential pattern mining usually assume that mining is performed on a static sequence database. However, databases change as they are updated, so previously discovered patterns may become invalid and new patterns may emerge. Beyond its higher complexity, maintaining sequential patterns is more challenging than maintaining association rules because of sequence merging. Sequence merging, which is unique to sequence databases, requires appended new sequences to be merged with existing ones that share the same customer id. Re-mining the whole database appears inevitable, since the information collected in previous discovery is corrupted by sequence merging. Instead of re-mining, the proposed IncSP (Incremental Sequential Pattern Update) algorithm solves the maintenance problem through effective implicit merging and efficient separate counting over appended sequences. Previously found patterns are incrementally updated rather than re-mined from scratch. Moreover, early candidate pruning further speeds up the discovery of new patterns. Empirical evaluation on comprehensive synthetic data shows that IncSP is fast and scalable. (C) 2003 Elsevier Ltd. All rights reserved.
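
The sequence-merging problem described in the abstract can be illustrated with a small sketch. This is not the IncSP algorithm: it performs the merge explicitly and recounts support naively, only to show why counts gathered before an update cannot simply be added to counts from the appended data. The function names (`merge_sequences`, `contains`, `support`) and the toy data are assumptions made for this illustration, with each customer sequence represented as a list of itemsets.

```python
def merge_sequences(original, increment):
    """Append each customer's new itemsets to that customer's existing sequence.

    Both arguments map a customer id to a list of itemsets (Python sets).
    Customers that appear only in the increment become new sequences.
    """
    merged = {cid: list(seq) for cid, seq in original.items()}
    for cid, new_itemsets in increment.items():
        merged.setdefault(cid, []).extend(new_itemsets)
    return merged


def contains(sequence, pattern):
    """Return True if `pattern` is a subsequence of `sequence`.

    Each pattern itemset must be a subset of some sequence itemset,
    and the matches must occur in order. Greedy earliest matching suffices.
    """
    pos = 0
    for itemset in sequence:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def support(db, pattern):
    """Count the customer sequences in `db` that contain `pattern`."""
    return sum(1 for seq in db.values() if contains(seq, pattern))


# Toy data (hypothetical): customer 2 exists in both the original
# database and the increment, so its sequences must be merged.
original = {1: [{"a"}, {"b"}], 2: [{"a"}]}
increment = {2: [{"b"}], 3: [{"a"}, {"b"}]}
pattern = [{"a"}, {"b"}]

merged = merge_sequences(original, increment)
old_count = support(original, pattern)
increment_count = support(increment, pattern)
merged_count = support(merged, pattern)
print(old_count, increment_count, merged_count)
```

In this toy case the pattern is supported by one sequence in the original database and one in the increment, yet by three in the merged database, because customer 2 supports the pattern only after merging. That gap is the reason naive reuse of earlier counts fails and the motivation the abstract gives for implicit merging and separate counting.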
Pages: 385-404
Page count: 20