PaMPa-HD: a Parallel MapReduce-based frequent Pattern miner for High-Dimensional data

Cited by: 13
Authors
Apiletti, Daniele [1]
Baralis, Elena [1]
Cerquitelli, Tania [1]
Garza, Paolo [1]
Pulvirenti, Fabio [1]
Michiardi, Pietro [2]
Affiliations
[1] Politecn Torino, Dipartimento Automat & Informat, Turin, Italy
[2] Eurecom, Sophia Antipolis, France
DOI
10.1109/ICDMW.2015.18
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Frequent closed itemset mining is among the most complex exploratory techniques in data mining, and provides the ability to discover hidden correlations in transactional datasets. The explosion of Big Data is leading to new parallel and distributed approaches. Unfortunately, most of them are designed to cope with low-dimensional datasets, and no distributed frequent closed itemset mining algorithm exists for high-dimensional datasets. This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on Carpenter. The experimental results, obtained on both real and synthetic datasets, show the efficiency and scalability of PaMPa-HD.
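For readers unfamiliar with the terminology, the following is a minimal illustrative sketch of what a frequent closed itemset is, not the PaMPa-HD or Carpenter implementation. It assumes a toy in-memory dataset and uses naive row enumeration (intersecting every group of transactions), which is the general idea that row-enumeration miners such as Carpenter refine with pruning. The function name `closed_frequent_itemsets`, the toy data, and the threshold are all hypothetical.

```python
from itertools import combinations


def closed_frequent_itemsets(transactions, min_support):
    """Naive row enumeration (illustrative only, exponential in the row count).

    The intersection of any non-empty group of rows is a closed itemset,
    so enumerating all groups yields every closed itemset. An itemset is
    kept when at least `min_support` rows contain it.
    """
    rows = [frozenset(t) for t in transactions]
    result = {}
    for size in range(1, len(rows) + 1):
        for group in combinations(rows, size):
            itemset = frozenset.intersection(*group)
            if not itemset:
                continue  # the empty itemset is not reported
            # Support: number of rows that contain the whole itemset.
            support = sum(1 for row in rows if itemset <= row)
            if support >= min_support:
                result[itemset] = support
    return result


# Hypothetical toy dataset: four transactions over the items a, b, c, d.
toy = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "d"},
]
closed = closed_frequent_itemsets(toy, min_support=2)
for itemset, support in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

In this toy data, `{c}` is frequent but not closed, because every transaction containing `c` also contains `a`; the sketch reports `{a, c}` instead. Enumerating groups of rows rather than groups of items is what makes this style of search attractive when a dataset has few rows and very many items, which is the high-dimensional setting the abstract refers to.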
Pages: 839 - 846
Number of pages: 8
Related papers
50 in total
  • [1] A MapReduce-Based Parallel Frequent Pattern Growth Algorithm for Spatiotemporal Association Analysis of Mobile Trajectory Big Data
    Xia, Dawen
    Lu, Xiaonan
    Li, Huaqing
    Wang, Wendong
    Li, Yantao
    Zhang, Zili
    COMPLEXITY, 2018
  • [2] PFIMD: a parallel MapReduce-based algorithm for frequent itemset mining
    Mao Yimin
    Geng Junhao
    Deborah Simon Mwakapesa
    Yaser Ahangari Nanehkaran
    Zhang Chi
    Deng Xiaoheng
    Chen Zhigang
    Multimedia Systems, 2021, 27 : 709 - 722
  • [3] PFIMD: a parallel MapReduce-based algorithm for frequent itemset mining
    Mao, Yimin
    Geng, Junhao
    Mwakapesa, Deborah Simon
    Nanehkaran, Yaser Ahangari
    Chi, Zhang
    Deng, Xiaoheng
    Chen, Zhigang
    MULTIMEDIA SYSTEMS, 2021, 27 (04) : 709 - 722
  • [4] MapReduce-Based Frequent Pattern Mining Framework with Multiple Item Support
    Wang, Chen-Shu
    Lin, Shiang-Lin
    Chang, Jui-Yen
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2017), PT II, 2017, 10192 : 65 - 74
  • [5] MapReduce-based Parallel Algorithms for Multidimensionnal Data Analysis
    Pan, Jie
    Magoules, Frederic
    Le Biannic, Yann
    JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2012, 6 (02) : 325 - 350
  • [6] Parallel similarity joins on massive high-dimensional data using MapReduce
    Ma, Youzhong
    Meng, Xiaofeng
    Wang, Shaoya
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2016, 28 (01) : 166 - 183
  • [7] MapReduce-based Parallelized Approximation of Frequent Itemsets Mining in Uncertain Data
    Xu, Jing
    Mao, Xiao-Jiao
    Lu, Wen-Yang
    Zhu, Qi-Hai
    Li, Ning
    Yang, Yu-Bin
    NEURAL INFORMATION PROCESSING, ICONIP 2015, PT IV, 2015, 9492 : 136 - 144
  • [8] PHiDJ: Parallel Similarity Self-Join for High-Dimensional Vector Data with MapReduce
    Fries, Sergej
    Boden, Brigitte
    Stepien, Grzegorz
    Seidl, Thomas
    2014 IEEE 30TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2014 : 796 - 807
  • [9] Parallel attribute reduction in high-dimensional data: An efficient MapReduce strategy with fuzzy discernibility matrix
    Sowkuntla, Pandu
    Prasad, P. S. V. S. Sai
    APPLIED SOFT COMPUTING, 2025, 172
  • [10] Data Categorization Using Hadoop MapReduce-Based Parallel K-Means Clustering
    Ansari Z.
    Afzal A.
    Sardar T.H.
    Journal of The Institution of Engineers (India): Series B, 2019, 100 (02) : 95 - 103