Incremental update on sequential patterns in large databases by implicit merging and efficient counting

Cited by: 29
Authors
Lin, MY [1]
Lee, SY [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Comp Sci & Informat Engn, Hsinchu 30050, Taiwan
Keywords
data mining; sequential patterns; incremental update; sequence discovery; sequence merging
DOI
10.1016/S0306-4379(03)00036-X
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Current approaches for sequential pattern mining usually assume that the mining is performed in a static sequence database. However, databases are not static due to update so that the discovered patterns might become invalid and new patterns could be created. In addition to higher complexity, the maintenance of sequential patterns is more challenging than that of association rules owing to sequence merging. Sequence merging, which is unique in sequence databases, requires the appended new sequences to be merged with the existing ones if their customer ids are the same. Re-mining of the whole database appears to be inevitable since the information collected in previous discovery will be corrupted by sequence merging. Instead of re-mining, the proposed IncSP (Incremental Sequential Pattern Update) algorithm solves the maintenance problem through effective implicit merging and efficient separate counting over appended sequences. Patterns found previously are incrementally updated rather than re-mined from scratch. Moreover, the technique of early candidate pruning further speeds up the discovery of new patterns. Empirical evaluation using comprehensive synthetic data shows that IncSP is fast and scalable. (C) 2003 Elsevier Ltd. All rights reserved.
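The sequence-merging problem described in the abstract can be illustrated with a small sketch. This is not the IncSP algorithm; it is a naive explicit merge, assuming a database is a dict from customer id to a list of itemsets. The names `merge_by_customer`, `contains`, and `support` are hypothetical and chosen only for illustration. It shows why support counts taken separately on the original database and on the appended sequences cannot simply be added together.

```python
# Illustrative sketch only: a naive explicit merge, NOT the IncSP algorithm.
# Assumption: a database maps customer id -> list of itemsets (frozensets).

def merge_by_customer(original, increment):
    """Append each new sequence to the existing sequence with the same customer id."""
    merged = {cid: list(seq) for cid, seq in original.items()}
    for cid, seq in increment.items():
        merged.setdefault(cid, []).extend(seq)
    return merged

def contains(sequence, pattern):
    """Return True if pattern (a list of itemsets) is a subsequence of sequence."""
    pos = 0
    for itemset in sequence:
        # Greedy match: take the earliest itemset that covers the next pattern element.
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)

def support(db, pattern):
    """Count the customers whose sequence contains the pattern."""
    return sum(1 for seq in db.values() if contains(seq, pattern))

a, b, c = frozenset("a"), frozenset("b"), frozenset("c")
original = {1: [a, b], 2: [a], 3: [b]}
increment = {2: [c], 4: [a, c]}   # customer 2 already exists, customer 4 is new
pattern = [a, c]

merged = merge_by_customer(original, increment)
print(support(original, pattern))   # 0
print(support(increment, pattern))  # 1
print(support(merged, pattern))     # 2
```

Customer 2 supports the pattern only after the appended sequence is merged with the existing one, so the merged count (2) exceeds the sum of the separate counts (0 + 1). According to the abstract, IncSP avoids both this kind of explicit merge and a full re-mining pass.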
Pages: 385-404
Page count: 20
Related papers
50 in total
  • [31] Efficient approach for mining high-utility patterns on incremental databases with dynamic profits
    Kim, Sinyoung
    Kim, Hanju
    Cho, Myungha
    Kim, Hyeonmo
    Vo, Bay
    Lin, Jerry Chun-Wei
    Yun, Unil
    KNOWLEDGE-BASED SYSTEMS, 2023, 282
  • [32] An efficient algorithm for mining association rules for large itemsets in large databases: from sequential to parallel
    Wong, AKY
    Wu, SL
    Feng, L
    ENGINEERING INTELLIGENT SYSTEMS FOR ELECTRICAL ENGINEERING AND COMMUNICATIONS, 2000, 8 (02): : 109 - 117
  • [33] Efficient mining for structurally diverse subgraph patterns in large molecular databases
    Maunz, Andreas
    Helma, Christoph
    Kramer, Stefan
    MACHINE LEARNING, 2011, 83 (02) : 193 - 218
  • [35] Efficient discovery of periodic-frequent patterns in very large databases
    Kiran, R. Uday
    Kitsuregawa, Masaru
    Reddy, P. Krishna
    JOURNAL OF SYSTEMS AND SOFTWARE, 2016, 112 : 110 - 121
  • [36] An efficient algorithm for mining high utility patterns from incremental databases with one database scan
    Yun, Unil
    Ryang, Heungmo
    Lee, Gangin
    Fujita, Hamido
    KNOWLEDGE-BASED SYSTEMS, 2017, 124 : 188 - 206
  • [37] Towards Efficient Discovery of Spatially Interesting Patterns in Geo-referenced Sequential Databases
    Suzuki, Shota
    Kiran, Rage Uday
    35TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, SSDBM 2023, 2023
  • [38] Efficient Mining of High Average-Utility Sequential Patterns from Uncertain Databases
    Lin, Jerry Chun-Wei
    Wu, Jimmy Ming-Tai
    Fournier-Viger, Philippe
    Hong, Tzung-Pei
    Li, Ting
    2019 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), 2019, : 1989 - 1994
  • [39] Discovery of Direct and Indirect Sequential Patterns with Multiple Minimum Supports in Large Transaction Databases
    Ouyang, Weimin
    Huang, Qinhua
    ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PT I, 2011, 7002 : 396 - 403
  • [40] Mining Positive and Negative Fuzzy Multiple Level Sequential Patterns in Large Transaction Databases
    Ouyang, Weimin
    Huang, Qinhua
    PROCEEDINGS OF THE 2009 WRI GLOBAL CONGRESS ON INTELLIGENT SYSTEMS, VOL I, 2009, : 500 - 504