A Generalized Parallel Algorithm for Frequent Itemset Mining

Cited by: 0
Authors
Craus, Mitica [1]
Archip, Alexandru [1]
Affiliations
[1] Gh Asachi Tech Univ, Dept Comp Engn, 53A D Mangeron St, Iasi 700050, Romania
Keywords
Data Mining; Association Rule Discovery; Frequent Itemset Mining; Parallel Algorithms
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
A parallel algorithm for finding the frequent itemsets in a set of transactions is presented. The frequent individual items are identified by their index. We assume that the number of processors ($m$) is less than the number of frequent items ($n$). In the first stage, every processor $P_i$, $i \in \{1, \ldots, m-1\}$, sequentially computes the frequent itemsets from the interval $I_i = [(i-1) \cdot p + 1,\ i \cdot p]$, where $p = [n/m]$. The processor $P_m$ computes the frequent itemsets from the interval $I_m = [(m-1) \cdot p + 1,\ n]$. In the second stage, the parallel algorithm is applied. The processor $P_i$ computes, step by step, the sets $F_{I_{i,j}}$ of the frequent itemsets with individual items from the intervals $I_{i,j} = I_i \cup I_{i+1} \cup \ldots \cup I_j$, $j = i+1, \ldots, m$. In order to compute the set $F_{I_{i,j}}$, the processor $P_i$ uses $F_{I_{i,j-1}}$, obtained in the previous step, and $F_{I_{i+1,j}}$, received from the processor $P_{i+1}$. The main advantage of our parallel algorithm is that it uses a communication pattern that is known before the algorithm starts, which allows the communication to be mapped to hardware. Another major advantage is that the set of transactions can be distributed to the processors before the algorithm begins. This is possible because a processor $P_i$ has to compute $F_{I_{i,j}}$, $j = i+1, \ldots, m$, and therefore needs only the transactions containing the frequent items starting with $I_i$.
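The two stages described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not say how the two input sets are combined, so the `merge` step below is one plausible reading (join itemsets that touch the first interval with itemsets that touch the last one, then check support). It also assumes that $[n/m]$ means the integer part of $n/m$, counts support by brute force, and simulates the processors with ordinary loops. The names `intervals`, `local_frequent`, `merge`, `mine`, and `minsup` are hypothetical.

```python
from itertools import combinations

def intervals(n, m):
    """Split item indices 1..n into m contiguous intervals.
    Assumption: p = floor(n / m); the last interval absorbs the remainder."""
    p = n // m
    bounds = [((i - 1) * p + 1, i * p) for i in range(1, m)]
    bounds.append(((m - 1) * p + 1, n))
    return bounds

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def local_frequent(lo, hi, transactions, minsup):
    """Stage 1: brute-force frequent itemsets built only from items in [lo, hi]."""
    items = [x for x in range(lo, hi + 1)
             if support(frozenset([x]), transactions) >= minsup]
    found = set()
    for k in range(1, len(items) + 1):
        level = {frozenset(c) for c in combinations(items, k)
                 if support(frozenset(c), transactions) >= minsup}
        if not level:  # no frequent k-itemsets, so none of size k + 1 either
            break
        found |= level
    return found

def merge(left, right, first, last, transactions, minsup):
    """Assumed combine step: build F[i,j] from left = F[i,j-1] and right = F[i+1,j].
    `first` and `last` are the (lo, hi) bounds of I_i and I_j."""
    result = left | right
    heads = [a for a in left if any(first[0] <= x <= first[1] for x in a)]
    tails = [b for b in right if any(last[0] <= x <= last[1] for x in b)]
    for a in heads:
        for b in tails:
            candidate = a | b
            if candidate not in result and support(candidate, transactions) >= minsup:
                result.add(candidate)
    return result

def mine(transactions, n, m, minsup):
    """Simulate both stages; processors are indexed from 0 here, not from 1."""
    iv = intervals(n, m)
    # F[(i, j)] holds the frequent itemsets over I_i ∪ ... ∪ I_j.
    F = {(i, i): local_frequent(iv[i][0], iv[i][1], transactions, minsup)
         for i in range(m)}
    for d in range(1, m):          # step d of stage 2
        for i in range(m - d):     # these iterations are independent of each other
            j = i + d
            # P_i reuses its own F[i, j-1] and "receives" F[i+1, j] from P_{i+1}.
            F[(i, j)] = merge(F[(i, j - 1)], F[(i + 1, j)],
                              iv[i], iv[j], transactions, minsup)
    return F[(0, m - 1)]
```

For example, `mine([{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {1, 2, 3, 4}, {2, 3}], n=4, m=2, minsup=3)` returns the four frequent single items together with the frequent pairs {1, 2}, {1, 3}, {1, 4}, and {2, 3}. In the inner loop, processor `i` only ever reads a result produced by processor `i + 1` in the previous step, which reflects the fixed neighbor-to-neighbor communication pattern the abstract emphasizes.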
Pages: 520 / +
Page count: 2