A Generalized Parallel Algorithm for Frequent Itemset Mining

Cited by: 0
Authors
Craus, Mitica [1]
Archip, Alexandru [1]
Affiliations
[1] Gh Asachi Tech Univ, Dept Comp Engn, 53A D Mangeron St, Iasi 700050, Romania
Keywords
Data Mining; Association Rule Discovery; Frequent Itemset Mining; Parallel Algorithms
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
A parallel algorithm for finding the frequent itemsets in a set of transactions is presented. The frequent individual items are identified by their index. We assume that the number of processors ($m$) is less than the number of frequent items ($n$). In the first stage, every processor $P_i$, $i \in \{1, \ldots, m-1\}$, sequentially computes the frequent itemsets from the interval $I_i = [(i-1) \cdot p + 1,\ i \cdot p]$, where $p = \lfloor n/m \rfloor$. The processor $P_m$ computes the frequent itemsets from the interval $I_m = [(m-1) \cdot p + 1,\ n]$. In the second stage, the parallel algorithm is applied. The processor $P_i$ computes, step by step, the sets $F_{I_{i,j}}$ of the frequent itemsets with individual items from the intervals $I_{i,j} = I_i \cup I_{i+1} \cup \cdots \cup I_j$, $j = i+1, \ldots, m$. To compute the set $F_{I_{i,j}}$, the processor $P_i$ uses $F_{I_{i,j-1}}$, obtained in the previous step, and $F_{I_{i+1,j}}$, received from the processor $P_{i+1}$. The main advantage of our parallel algorithm is that its communication pattern is known before the algorithm starts, which allows the communication to be mapped to hardware. Another major advantage is that the set of transactions can be distributed to the processors before the algorithm begins. This is possible because a processor $P_i$ has to compute $F_{I_{i,j}}$, $j = i+1, \ldots, m$, and therefore needs only the transactions containing frequent items from the intervals starting with $I_i$.
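The two stages described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function names (`intervals`, `frequent_in`, `merge`, `mine`) are hypothetical, the stage-one miner is a brute-force stand-in, and the rule used to combine $F_{I_{i,j-1}}$ with $F_{I_{i+1,j}}$ (testing unions of one itemset from each) is an assumption, since the abstract does not say how the two sets are combined. The loop over `step` mimics the fixed pattern in which $P_{i+1}$ sends its result to $P_i$.

```python
from itertools import combinations


def intervals(n, m):
    """Split item indices 1..n into m consecutive intervals, with p = n // m.

    The last interval absorbs the remainder, as in the abstract.
    """
    if not 0 < m < n:
        raise ValueError("expected 0 < m < n")
    p = n // m
    return [(i * p + 1, (i + 1) * p) for i in range(m - 1)] + [((m - 1) * p + 1, n)]


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_in(items, transactions, minsup):
    """Brute-force stand-in for the sequential stage-one miner (assumption)."""
    items = list(items)
    return {
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if support(frozenset(c), transactions) >= minsup
    }


def merge(left, right, transactions, minsup):
    """Combine F over I_{i,j-1} (left) and F over I_{i+1,j} (right).

    Assumed rule: any frequent itemset touching both I_i and I_j is the union
    of a frequent itemset from `left` and one from `right`, so only such
    unions are tested against the support threshold.
    """
    result = set(left) | set(right)
    for a in left:
        for b in right:
            c = a | b
            if c not in result and support(c, transactions) >= minsup:
                result.add(c)
    return result


def mine(transactions, n, m, minsup):
    """Sequentially simulate both stages for m processors and items 1..n."""
    F = {}
    # Stage 1: processor i mines its own interval I_i.
    for i, (lo, hi) in enumerate(intervals(n, m), start=1):
        F[i, i] = frequent_in(range(lo, hi + 1), transactions, minsup)
    # Stage 2: at each step, processor i extends its range by one interval,
    # using F[i + 1, j], which processor i + 1 produced in the previous step.
    for step in range(1, m):
        for i in range(1, m - step + 1):
            j = i + step
            F[i, j] = merge(F[i, j - 1], F[i + 1, j], transactions, minsup)
    return F[1, m]


transactions = [frozenset(t) for t in ([1, 2, 3], [1, 2, 4], [2, 3, 5], [1, 2, 3, 5], [4, 5])]
result = mine(transactions, n=5, m=3, minsup=2)
```

In this simulation every call to `merge` sees the full transaction list. In the setting the abstract describes, processor $P_i$ would hold only the transactions containing frequent items from the intervals starting with $I_i$.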
Pages: 520 / +
Number of pages: 2