PaMPa-HD: a Parallel MapReduce-based frequent Pattern miner for High-Dimensional data

Cited by: 13

Authors:

Apiletti, Daniele [1]

Baralis, Elena [1]

Cerquitelli, Tania [1]

Garza, Paolo [1]

Pulvirenti, Fabio [1]

Michiardi, Pietro [2]

Affiliations:

[1] Politecnico di Torino, Dipartimento di Automatica e Informatica, Turin, Italy

[2] Eurecom, Sophia Antipolis, France

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW) | 2015

DOI:

10.1109/ICDMW.2015.18

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Frequent closed itemset mining is among the most complex exploratory techniques in data mining, and provides the ability to discover hidden correlations in transactional datasets. The explosion of Big Data is leading to new parallel and distributed approaches. Unfortunately, most of them are designed to cope with low-dimensional datasets, whereas no distributed high-dimensional frequent closed itemset mining algorithm exists. This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on Carpenter. The experimental results, performed on both real and synthetic datasets, show the efficiency and scalability of PaMPa-HD.

Pages: 839-846

Number of pages: 8

50 items in total

[1] A MapReduce-Based Parallel Frequent Pattern Growth Algorithm for Spatiotemporal Association Analysis of Mobile Trajectory Big Data
Xia, Dawen
Lu, Xiaonan
Li, Huaqing
Wang, Wendong
Li, Yantao
Zhang, Zili
COMPLEXITY, 2018,
[2] PFIMD: a parallel MapReduce-based algorithm for frequent itemset mining
Mao Yimin
Geng Junhao
Deborah Simon Mwakapesa
Yaser Ahangari Nanehkaran
Zhang Chi
Deng Xiaoheng
Chen Zhigang
Multimedia Systems, 2021, 27 : 709 - 722
[3] PFIMD: a parallel MapReduce-based algorithm for frequent itemset mining
Mao, Yimin
Geng, Junhao
Mwakapesa, Deborah Simon
Nanehkaran, Yaser Ahangari
Chi, Zhang
Deng, Xiaoheng
Chen, Zhigang
MULTIMEDIA SYSTEMS, 2021, 27 (04) : 709 - 722
[4] MapReduce-Based Frequent Pattern Mining Framework with Multiple Item Support
Wang, Chen-Shu
Lin, Shiang-Lin
Chang, Jui-Yen
INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2017), PT II, 2017, 10192 : 65 - 74
[5] MapReduce-based Parallel Algorithms for Multidimensionnal Data Analysis
Pan, Jie
Magoules, Frederic
Le Biannic, Yann
JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2012, 6 (02) : 325 - 350
[6] Parallel similarity joins on massive high-dimensional data using MapReduce
Ma, Youzhong
Meng, Xiaofeng
Wang, Shaoya
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2016, 28 (01) : 166 - 183
[7] MapReduce-based Parallelized Approximation of Frequent Itemsets Mining in Uncertain Data
Xu, Jing
Mao, Xiao-Jiao
Lu, Wen-Yang
Zhu, Qi-Hai
Li, Ning
Yang, Yu-Bin
NEURAL INFORMATION PROCESSING, ICONIP 2015, PT IV, 2015, 9492 : 136 - 144
[8] PHiDJ: Parallel Similarity Self-Join for High-Dimensional Vector Data with MapReduce
Fries, Sergej
Boden, Brigitte
Stepien, Grzegorz
Seidl, Thomas
2014 IEEE 30TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2014, : 796 - 807
[9] Parallel attribute reduction in high-dimensional data: An efficient MapReduce strategy with fuzzy discernibility matrix
Sowkuntla, Pandu
Prasad, P. S. V. S. Sai
APPLIED SOFT COMPUTING, 2025, 172
[10] Data Categorization Using Hadoop MapReduce-Based Parallel K-Means Clustering
Ansari Z.
Afzal A.
Sardar T.H.
Journal of The Institution of Engineers (India): Series B, 2019, 100 (02) : 95 - 103