A Generalized Parallel Algorithm for Frequent Itemset Mining

Cited by: 0

Authors:

Craus, Mitica [1]

Archip, Alexandru [1]

Affiliations:

[1] Gh Asachi Tech Univ, Dept Comp Engn, 53A D Mangeron St, Iasi 700050, Romania

Source:

PROCEEDINGS OF THE 12TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTERS, PTS 1-3: NEW ASPECTS OF COMPUTERS | 2008

Keywords:

Data Mining; Association Rule Discovery; Frequent Itemset Mining; Parallel Algorithms;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

A parallel algorithm for finding the frequent itemsets in a set of transactions is presented. The frequent individual items are identified by their index. We assume that the number of processors ($m$) is less than the number of frequent items ($n$). In the first stage, every processor $P_i$, $i \in \{1, \dots, m-1\}$, sequentially computes the frequent itemsets from the interval $I_i = [(i-1) \cdot p + 1,\ i \cdot p]$, where $p = [n/m]$. The processor $P_m$ computes the frequent itemsets from the interval $I_m = [(m-1) \cdot p + 1,\ n]$. In the second stage, the parallel algorithm is applied. The processor $P_i$ computes, step by step, the sets $F_{I_{i,j}}$ of the frequent itemsets with individual items from the intervals $I_{i,j} = I_i \cup I_{i+1} \cup \dots \cup I_j$, $j = i+1, \dots, m$. In order to compute the set $F_{I_{i,j}}$, the processor $P_i$ uses $F_{I_{i,j-1}}$, obtained in the previous step, and $F_{I_{i+1,j}}$, received from the processor $P_{i+1}$. The main advantage of our parallel algorithm is that its communication pattern is known before the algorithm starts, which allows the communication to be mapped onto the hardware. Another major advantage is that the set of transactions can be distributed to the processors before the algorithm begins. This is possible because a processor $P_i$ has to compute $F_{I_{i,j}}$, $j = i+1, \dots, m$, and therefore needs only the transactions containing the frequent items starting with $I_i$.
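The two stages described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not say how $F_{I_{i,j}}$ is derived from $F_{I_{i,j-1}}$ and $F_{I_{i+1,j}}$, so the `merge` rule below (test every union of one itemset from each side) is an illustrative choice. Reading $[n/m]$ as integer division, the brute-force first stage, and all function names (`partition`, `support`, `local_frequent`, `merge`, `mine`) are likewise hypothetical.

```python
from itertools import combinations


def partition(n, m):
    """Split item indices 1..n into m intervals; the last takes the remainder."""
    p = n // m  # assumption: [n/m] is read as integer division
    bounds = [((i - 1) * p + 1, i * p) for i in range(1, m)]
    bounds.append(((m - 1) * p + 1, n))
    return bounds


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def local_frequent(lo, hi, transactions, minsup):
    """Stage 1 (brute force): frequent itemsets built only from items lo..hi."""
    items = range(lo, hi + 1)
    return {
        frozenset(c)
        for k in range(1, hi - lo + 2)
        for c in combinations(items, k)
        if support(frozenset(c), transactions) >= minsup
    }


def merge(left, right, transactions, minsup):
    """Combine F[i, j-1] and F[i+1, j] into F[i, j] (illustrative rule).

    Any frequent itemset touching both I_i and I_j is the union of a member
    of `left` and a member of `right`, so testing all such unions suffices.
    """
    result = left | right
    for x in left:
        for y in right:
            z = x | y
            if z not in result and support(z, transactions) >= minsup:
                result.add(z)
    return result


def mine(n, m, transactions, minsup):
    """Sequentially simulate both stages; F[(i, j)] covers intervals i..j."""
    bounds = partition(n, m)
    F = {
        (i, i): local_frequent(lo, hi, transactions, minsup)
        for i, (lo, hi) in enumerate(bounds, start=1)
    }
    for width in range(1, m):  # one step of the second stage per width
        for i in range(1, m - width + 1):
            j = i + width
            # P_i keeps F[i, j-1] from its previous step and would receive
            # F[i+1, j] from P_{i+1}; here both are read from the same dict.
            F[(i, j)] = merge(F[(i, j - 1)], F[(i + 1, j)], transactions, minsup)
    return F[(1, m)]


# Toy data: items are assumed to be already reindexed as 1..n.
transactions = [{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}]
result = mine(n=4, m=2, transactions=transactions, minsup=3)
```

In this simulation the inner loop over `i` stands in for the processors working concurrently within one step. In a real deployment each $P_i$ would exchange data only with $P_{i+1}$, which is the fixed communication pattern the abstract highlights.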


Pages: 520 / +

Page count: 2
