A Generalized Parallel Algorithm for Frequent Itemset Mining

Cited by: 0

Authors:

Craus, Mitica [1]

Archip, Alexandru [1]

Affiliations:

[1] Gh Asachi Tech Univ, Dept Comp Engn, 53A D Mangeron St, Iasi 700050, Romania

Source:

PROCEEDINGS OF THE 12TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTERS, PTS 1-3: NEW ASPECTS OF COMPUTERS | 2008

Keywords:

Data Mining; Association Rule Discovery; Frequent Itemset Mining; Parallel Algorithms;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

A parallel algorithm for finding the frequent itemsets in a set of transactions is presented. The frequent individual items are identified by their index. We assume that the number of processors ($m$) is less than the number of frequent items ($n$). In the first stage, every processor $P_i$, $i \in \{1, \ldots, m-1\}$, sequentially computes the frequent itemsets from the interval $I_i = [(i-1) \cdot p + 1, i \cdot p]$, where $p = [n/m]$. The processor $P_m$ computes the frequent itemsets from the interval $I_m = [(m-1) \cdot p + 1, n]$. In the second stage, the parallel algorithm is applied. The processor $P_i$ computes, step by step, the sets $F_{I_{i,j}}$ of the frequent itemsets with individual items from the intervals $I_{i,j} = I_i \cup I_{i+1} \cup \ldots \cup I_j$, $j = i+1, \ldots, m$. In order to compute the set $F_{I_{i,j}}$, the processor $P_i$ uses $F_{I_{i,j-1}}$, obtained in the previous step, and $F_{I_{i+1,j}}$, received from the processor $P_{i+1}$. The main advantage of our parallel algorithm is that its communication pattern is known before the algorithm starts, which allows the communication to be mapped to hardware. Another major advantage is that the set of transactions can be distributed to the processors before the algorithm begins. This is possible because a processor $P_i$ has to compute $F_{I_{i,j}}$, $j = i+1, \ldots, m$, and therefore needs only the transactions containing the frequent items starting with $I_i$.
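
The interval partition and the fixed communication pattern described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation. The function names `partition_items` and `communication_schedule` are hypothetical, the bracket in $p = [n/m]$ is assumed to mean the floor, and the steps are assumed to run in lockstep so that step $s = j - i$ is the same for all processors. The sketch computes only the index intervals and the list of messages; it does not mine any itemsets.

```python
def partition_items(n, m):
    """Split item indices 1..n into m contiguous intervals (inclusive bounds)."""
    if not 0 < m < n:
        raise ValueError("expected 0 < m < n (fewer processors than frequent items)")
    p = n // m  # assumption: [n/m] in the abstract is read as the floor
    # I_i = [(i-1)*p + 1, i*p] for i = 1..m-1
    intervals = [((i - 1) * p + 1, i * p) for i in range(1, m)]
    # I_m = [(m-1)*p + 1, n] absorbs the remaining items
    intervals.append(((m - 1) * p + 1, n))
    return intervals


def communication_schedule(m):
    """Return, for each step s = j - i, what every active processor P_i does.

    Each entry names index pairs (a, b) standing for the set F_{I_{a,b}}.
    Assumption: all processors advance in lockstep, one value of j per step.
    """
    schedule = []
    for s in range(1, m):
        step = []
        for i in range(1, m - s + 1):
            j = i + s
            step.append({
                "receiver": i,           # P_i
                "sender": i + 1,         # P_{i+1}
                "received": (i + 1, j),  # F_{I_{i+1,j}}, sent by P_{i+1}
                "reused": (i, j - 1),    # F_{I_{i,j-1}}, from P_i's previous step
                "computed": (i, j),      # F_{I_{i,j}}, the result of this step
            })
        schedule.append(step)
    return schedule


if __name__ == "__main__":
    print(partition_items(10, 3))  # [(1, 3), (4, 6), (7, 10)]
    for s, step in enumerate(communication_schedule(3), start=1):
        for op in step:
            print(f"step {s}: P{op['sender']} -> P{op['receiver']}, "
                  f"computes F_I{op['computed']}")
```

Because the schedule depends only on $m$, it can be generated before any data is read, which is the property the abstract highlights when it says the communication pattern is known before the algorithm starts.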

Citation

Pages: 520 / +

Page count: 2

