Deep Parallelization of Parallel FP-Growth Using Parent-Child MapReduce

Cited by: 0
Authors
Makanju, Adetokunbo [1]
Farzanyar, Zahra [1]
An, Aijun [1]
Cercone, Nick [1]
Hu, Zane Zhenhua [2]
Hu, Yonggang [2]
Affiliations
[1] York Univ, EECS Dept, Toronto, ON, Canada
[2] IBM Canada, Markham, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
MapReduce; Frequent Pattern Mining; PFP; FP-Growth; Recursive Divide and Conquer;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
MapReduce is an important programming model for processing data in distributed environments. Compared to other distributed programming models, MapReduce reduces communication overhead between computers and improves fault tolerance. However, the MapReduce model does not allow for automatic synchronization between jobs. Many data analytics algorithms use a recursive divide-and-conquer approach, which inherently allows for parallelism at each level of recursion, but such algorithms are often difficult to parallelize with the traditional MapReduce model when the process requires synchronization. In this paper we introduce Parent-Child MapReduce, a version of the MapReduce programming model that allows MapReduce tasks to be created dynamically and synchronized in a hierarchical parent-child fashion. Using the Parallel FP-Growth (PFP) algorithm for mining frequent patterns as a reference, we show that Parent-Child MapReduce can be used to parallelize recursive divide-and-conquer algorithms under the MapReduce model and that this can significantly speed up such algorithms. Our evaluation shows a performance gain of 68% (or 3 times) when Parent-Child MapReduce is used with PFP.
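The parent-child synchronization pattern described in the abstract can be illustrated with a minimal sketch. This is not the authors' Parent-Child MapReduce implementation and not PFP itself: it assumes Python threads in place of MapReduce tasks, and it uses a simplified pattern-growth recursion over projected transaction lists rather than an FP-tree. The names `mine` and `max_parallel_depth` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def mine(transactions, min_support, prefix=(), depth=0, max_parallel_depth=2):
    """Return {itemset (sorted tuple): support} for frequent itemsets extending prefix."""
    # Count each item once per transaction in the current (conditional) database.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = sorted(item for item, c in counts.items() if c >= min_support)

    # Every frequent item extends the prefix into a frequent itemset.
    results = {prefix + (item,): counts[item] for item in frequent}

    def child(item):
        # Divide step: build the conditional database for `item`, keeping only
        # frequent items that sort after it so each itemset is generated once.
        keep = {i for i in frequent if i > item}
        projected = [[i for i in t if i in keep] for t in transactions if item in t]
        projected = [t for t in projected if t]
        return mine(projected, min_support, prefix + (item,),
                    depth + 1, max_parallel_depth)

    if depth < max_parallel_depth and len(frequent) > 1:
        # The parent creates one child task per frequent item and blocks
        # until all of them finish (the hierarchical synchronization point).
        # Each parent owns its pool, so nested waits cannot starve each other.
        with ThreadPoolExecutor(max_workers=len(frequent)) as pool:
            child_results = list(pool.map(child, frequent))
    else:
        # Below the parallel depth limit, recurse sequentially.
        child_results = [child(item) for item in frequent]

    # Conquer step: merge the children's results into the parent's.
    for partial in child_results:
        results.update(partial)
    return results


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(mine(data, min_support=3).items()):
        print(itemset, support)
```

Setting `max_parallel_depth=0` runs the same recursion sequentially, which gives a simple way to check that the parallel and sequential results agree.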
Pages: 1422-1431
Number of pages: 10