Mining frequent closed itemsets from distributed repositories

Authors
Lucchese, Claudio [1]
Orlando, Salvatore [1]
Perego, Raffaele [2]
Silvestri, Claudio [3]
Affiliations
[1] Ca Foscari Univ Venice, Dept Comp Sci, Venice, Italy
[2] CNR, ISTI, HPC Lab, I-56100 Pisa, Italy
[3] Univ Venice, Dept Comp Sci, I-30123 Venice, Italy
Keywords
frequent itemsets; closed itemsets; Knowledge Grid; distributed data mining
DOI
10.1007/978-0-387-37831-2_14
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the problem of mining frequent closed itemsets in a highly distributed setting such as a Grid. Extracting frequent (closed) itemsets is a very expensive phase, needed to derive a reduced set of meaningful association rules from a transactional database. We consider an environment in which different datasets are stored at different sites. We assume that, because of the huge size of the datasets and privacy concerns, dataset partitions cannot be moved to a centralized site to materialize the whole dataset and perform the mining task there. It is therefore mandatory to mine each site separately and then merge the local results to derive global knowledge. This paper shows how frequent closed itemsets, mined independently at each site, can be merged to derive the globally frequent closed itemsets. Unfortunately, such merging might produce a superset of all the frequent closed itemsets, and the associated supports could be smaller than the exact ones, because some globally frequent closed itemsets might not be locally frequent in some partitions. To avoid the expensive post-processing phase needed to compute exact global results, we use a method that approximates the supports of closed itemsets. The approximation is needed only for those globally frequent closed itemsets that are locally infrequent on some dataset partitions and are therefore not returned at all by the corresponding sites.
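
The support underestimation described in the abstract can be illustrated with a small Python sketch. This is not the authors' merging or approximation method, which the abstract does not specify; it is a simplified illustration under stated assumptions. Each site is assumed to return a mapping from its locally frequent closed itemsets to their local support counts, and each site's absolute minimum support count is assumed to be known. All names (`local_support`, `support_bounds`, `site_1`, `site_2`) are hypothetical. The sketch relies on one standard property: the support of a locally frequent itemset equals the largest support among the closed itemsets that contain it. When a site returns no such superset, the itemset is locally infrequent there, so its true local count lies between 0 and the local threshold minus 1.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Itemset = FrozenSet[str]
# Hypothetical per-site result: locally frequent closed itemset -> local support count.
LocalResult = Dict[Itemset, int]


def local_support(itemset: Itemset, closed: LocalResult) -> Optional[int]:
    """Return the itemset's support at one site, or None if it is locally infrequent.

    A locally frequent itemset has the same support as its closure, which is
    the closed superset with the largest support.
    """
    supports = [count for c, count in closed.items() if itemset <= c]
    return max(supports) if supports else None


def support_bounds(
    itemset: Itemset,
    sites: List[LocalResult],
    local_min_counts: List[int],
) -> Tuple[int, int]:
    """Return (lower, upper) bounds on the global support count of `itemset`."""
    lower = upper = 0
    for closed, min_count in zip(sites, local_min_counts):
        count = local_support(itemset, closed)
        if count is None:
            # Site did not report the itemset: true local count is in [0, min_count - 1].
            upper += min_count - 1
        else:
            lower += count
            upper += count
    return lower, upper


# Toy data (assumed): both sites use a local minimum support count of 2.
# Site 1 transactions: {a,b}, {a,b}, {a,b}, {a}
site_1: LocalResult = {frozenset("a"): 4, frozenset("ab"): 3}
# Site 2 transactions: {a}, {a}, {a,c}, {a,c}, {a,b}  ({a,b} occurs once, so it is not reported)
site_2: LocalResult = {frozenset("a"): 5, frozenset("ac"): 2}

bounds_ab = support_bounds(frozenset("ab"), [site_1, site_2], [2, 2])
bounds_a = support_bounds(frozenset("a"), [site_1, site_2], [2, 2])
```

In this toy data the merged count for {a, b} is 3, while its exact global support is 4: the naive sum is only a lower bound, and the interval's upper end shows how much support the unreported site could be hiding. Any approximation of the kind the abstract mentions would have to pick a value inside such an interval.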
Pages: 221 / +
Number of pages: 4