Mining frequent closed itemsets from distributed repositories

Cited by: 1
Authors
Lucchese, Claudio [1]
Orlando, Salvatore [1]
Perego, Raffaele [2]
Silvestri, Claudio [3]
Affiliations
[1] Ca Foscari Univ Venice, Dept Comp Sci, Venice, Italy
[2] CNR, ISTI, HPC Lab, I-56100 Pisa, Italy
[3] Univ Venice, Dept Comp Sci, I-30123 Venice, Italy
Keywords
frequent itemsets; closed itemsets; Knowledge Grid; distributed data mining
DOI
10.1007/978-0-387-37831-2_14
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we address the problem of mining frequent closed itemsets in a highly distributed setting such as a Grid. Extracting frequent (closed) itemsets is a very expensive phase that is needed to derive a reduced set of meaningful association rules from a transactional database. We consider an environment in which different datasets are stored at different sites. We assume that, because of the huge size of the datasets and privacy concerns, dataset partitions cannot be moved to a centralized site to materialize the whole dataset and perform the mining task there. It therefore becomes mandatory to mine each site separately and then merge the local results to derive global knowledge. This paper shows how frequent closed itemsets, mined independently at each site, can be merged to derive globally frequent closed itemsets. Unfortunately, such merging might produce a superset of all the frequent closed itemsets, while the associated supports could be smaller than the exact ones, because some globally frequent closed itemsets might not be locally frequent in some partitions. To avoid the expensive post-processing phase needed to compute exact global results, we use a method to approximate the supports of closed itemsets. The approximation is needed only for those globally frequent closed itemsets that are locally infrequent on some dataset partitions and are therefore not returned at all by the corresponding sites.
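The merging step described in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: the function names, the data layout (one dict per site mapping a closed itemset to its local support), and the choice to form candidates from cross-site pairwise intersections are all assumptions made for illustration. The sketch only computes a lower bound on each global support and records which sites returned no information; the approximation method mentioned in the abstract is not reproduced here. With more than two sites, the intersection step would presumably have to be iterated.

```python
from itertools import combinations


def local_support(itemset, local_closed):
    """Support of `itemset` in one partition, derived from that partition's
    frequent closed itemsets; None if no closed superset was returned."""
    supports = [s for closed, s in local_closed.items() if itemset <= closed]
    return max(supports) if supports else None


def merge_local_results(local_results):
    """Hypothetical merge: returns {itemset: (support_lower_bound, missing_sites)}."""
    # Candidates: every local closed itemset plus non-empty cross-site intersections.
    candidates = set()
    for result in local_results:
        candidates.update(result)
    for a, b in combinations(local_results, 2):
        for x in a:
            for y in b:
                if x & y:
                    candidates.add(x & y)

    merged = {}
    for cand in candidates:
        per_site = [local_support(cand, r) for r in local_results]
        # Sites with no answer contribute nothing, so the sum is only a lower bound.
        lower_bound = sum(s for s in per_site if s is not None)
        missing = [i for i, s in enumerate(per_site) if s is None]
        merged[cand] = (lower_bound, missing)
    return merged


# Two sites; site 1 did not return {b} because it was locally infrequent there.
site_0 = {frozenset("a"): 5, frozenset("ab"): 3}
site_1 = {frozenset("ac"): 4}
merged = merge_local_results([site_0, site_1])
```

In this toy input, `{a, b}` gets a support from site 0 only, so its merged value is a lower bound and site 1 is listed as missing. That is the situation in which the abstract says an approximation of the support is needed.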
Pages: 221 / +
Page count: 4