Effective database transformation and efficient support computation for mining sequential patterns

Cited by: 0

Authors:

Cho, Chung-Wen [2]

Wu, Yi-Hung [3]

Chen, Arbee L. P. [1]

Affiliations:

[1] Natl Chengchi Univ, Dept Comp Sci, Taipei, Taiwan

[2] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 30043, Taiwan

[3] Chung Yuan Christian Univ, Dept Informat & Comp Engn, Jhongli, Taiwan

Source:

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS | 2009, Vol. 32, No. 1

Keywords:

Data mining; Sequential patterns; Database transformation; Support computation; Database projection; ALGORITHM;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a novel algorithm for mining frequent sequences from transaction databases. The transactions of the same customer form a customer sequence. A sequence (an ordered list of itemsets) is frequent if the number of customer sequences containing it satisfies the user-specified threshold. A 1-sequence is a special type of sequence because it consists of only a single itemset instead of an ordered list, while a k-sequence is a sequence composed of k itemsets. Compared with the cost of mining frequent k-sequences (k ≥ 2), the cost of mining frequent 1-sequences is negligible. We adopt a two-phase architecture that finds the two types of frequent sequences separately, so that the discovery of frequent k-sequences can be designed and optimized on its own. For efficient frequent k-sequence mining, every frequent 1-sequence is encoded as a unique symbol and the database is transformed into one composed of these symbols. We find that it is unnecessary to encode all the frequent 1-sequences, and we make full use of the discovered frequent 1-sequences to transform the database into a smaller one. For every k ≥ 2, the customer sequences in the transformed database are scanned to find all the frequent k-sequences. We devise a compact representation for a customer sequence and elaborate a method to enumerate all distinct subsequences of a customer sequence without redundant scans. The soundness of the proposed approach is verified and a number of experiments are performed. The results show that our approach outperforms previous work in both scalability and execution time.
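The definitions in the abstract (sequence containment, support, and encoding frequent 1-sequences as symbols) can be illustrated with a minimal sketch. This is not the authors' algorithm: it shows only the textbook notions, and the transformation encodes every supplied frequent itemset, whereas the paper reports that encoding all of them is unnecessary. The names `contains`, `support`, `transform`, and the toy database `DB` are assumptions made for illustration.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]
Sequence_ = List[Itemset]  # an ordered list of itemsets

def contains(customer_seq: Sequence_, pattern: Sequence_) -> bool:
    """True if each itemset of `pattern` is a subset of a distinct itemset
    of `customer_seq`, with the order preserved (greedy left-to-right match)."""
    pos = 0
    for itemset in pattern:
        while pos < len(customer_seq) and not itemset <= customer_seq[pos]:
            pos += 1
        if pos == len(customer_seq):
            return False
        pos += 1  # the next pattern itemset must match a later transaction
    return True

def support(database: List[Sequence_], pattern: Sequence_) -> int:
    """Number of customer sequences that contain `pattern`."""
    return sum(1 for seq in database if contains(seq, pattern))

def transform(database: List[Sequence_],
              frequent_itemsets: List[Itemset]) -> List[List[FrozenSet[int]]]:
    """Replace each transaction by the symbols (here: list indices) of the
    frequent itemsets it contains; transactions with no symbol are dropped.
    Assumption: this encodes ALL given itemsets, unlike the paper's reduced encoding."""
    transformed = []
    for seq in database:
        new_seq = []
        for transaction in seq:
            symbols = frozenset(i for i, fs in enumerate(frequent_itemsets)
                                if fs <= transaction)
            if symbols:
                new_seq.append(symbols)
        transformed.append(new_seq)
    return transformed

# Hypothetical toy database: three customers, each an ordered list of transactions.
DB = [
    [frozenset({"a"}), frozenset({"b", "c"}), frozenset({"d"})],
    [frozenset({"a", "b"}), frozenset({"c"})],
    [frozenset({"b"}), frozenset({"a"}), frozenset({"d"})],
]
```

With a threshold of 2, the 2-sequence consisting of {a} followed by {d} would count as frequent in `DB`, since the first and third customer sequences contain it.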

Pages: 23-51

Page count: 29
