PaMPa-HD: a Parallel MapReduce-based frequent Pattern miner for High-Dimensional data

Cited by: 13

Authors:

Apiletti, Daniele [1]

Baralis, Elena [1]

Cerquitelli, Tania [1]

Garza, Paolo [1]

Pulvirenti, Fabio [1]

Michiardi, Pietro [2]

Affiliations:

[1] Politecnico di Torino, Dipartimento di Automatica e Informatica, Turin, Italy

[2] Eurecom, Sophia Antipolis, France

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW) | 2015

DOI:

10.1109/ICDMW.2015.18

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Frequent closed itemset mining is among the most complex exploratory techniques in data mining, and it can discover hidden correlations in transactional datasets. The explosion of Big Data is leading to new parallel and distributed approaches. Unfortunately, most of them are designed to cope with low-dimensional datasets, and no distributed frequent closed itemset mining algorithm exists for high-dimensional data. This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on Carpenter. The experimental results, obtained on both real and synthetic datasets, show the efficiency and scalability of PaMPa-HD.

Pages: 839-846

Page count: 8
