Mining frequent closed itemsets from distributed repositories

Cited by: 1
Authors
Lucchese, Claudio [1]
Orlando, Salvatore [1]
Perego, Raffaele [2]
Silvestri, Claudio [3]
Affiliations
[1] Ca Foscari Univ Venice, Dept Comp Sci, Venice, Italy
[2] CNR, ISTI, HPC Lab, I-56100 Pisa, Italy
[3] Univ Venice, Dept Comp Sci, I-30123 Venice, Italy
Keywords
frequent itemsets; closed itemsets; Knowledge Grid; distributed data mining
DOI
10.1007/978-0-387-37831-2_14
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the problem of mining frequent closed itemsets in a highly distributed setting such as a Grid. Extracting frequent (closed) itemsets is a very expensive phase required to derive a reduced set of meaningful association rules from a transactional database. We consider an environment in which different datasets are stored at different sites. We assume that, because of the huge size of the datasets and privacy concerns, dataset partitions cannot be moved to a centralized site where the whole dataset could be materialized and mined. It therefore becomes mandatory to mine each site separately and then merge the local results to derive global knowledge. The paper shows how frequent closed itemsets, mined independently at each site, can be merged to derive globally frequent closed itemsets. Unfortunately, such merging might produce a superset of all the frequent closed itemsets, and the associated supports could be smaller than the exact ones, because some globally frequent closed itemsets might not be locally frequent in some partitions. To avoid the expensive post-processing phase needed to compute exact global results, we use a method to approximate the supports of closed itemsets. The approximation is needed only for those globally frequent closed itemsets that are locally infrequent on some dataset partitions and are therefore not returned at all by the corresponding sites.
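The merge step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not spell out. The sketch assumes each site reports a dictionary mapping its locally frequent closed itemsets (as `frozenset`s) to local supports; the function names `local_support`, `merge_candidates`, and `merge` are hypothetical. It relies on the standard property that the support of an itemset equals the largest support among its closed supersets. When a site returned no closed superset, the sketch counts 0 for that site, so merged supports are lower bounds on the exact ones, as the abstract warns. Closing the candidate set under intersection is an illustrative choice.

```python
from itertools import combinations


def local_support(itemset, local_closed):
    """Support of `itemset` at one site, read from that site's closed itemsets.

    Uses the largest support among returned closed supersets; returns 0 when
    the site returned none (the itemset was locally infrequent there).
    """
    return max((s for c, s in local_closed.items() if itemset <= c), default=0)


def merge_candidates(sites):
    """All local closed itemsets, closed under non-empty pairwise intersection."""
    candidates = set()
    for local_closed in sites:
        candidates |= set(local_closed)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow safely inside the loop.
        for a, b in combinations(list(candidates), 2):
            inter = a & b
            if inter and inter not in candidates:
                candidates.add(inter)
                changed = True
    return candidates


def merge(sites, min_support):
    """Candidates whose summed (lower-bound) support reaches `min_support`."""
    merged = {}
    for cand in merge_candidates(sites):
        total = sum(local_support(cand, local_closed) for local_closed in sites)
        if total >= min_support:
            merged[cand] = total
    return merged


if __name__ == "__main__":
    # Hypothetical toy data: site 1 holds transactions abc, abc, ab;
    # site 2 holds ac, ac, b. Each site used a local threshold of 2.
    site1 = {frozenset("ab"): 3, frozenset("abc"): 2}
    site2 = {frozenset("ac"): 2}
    for itemset, support in sorted(
        merge([site1, site2], min_support=3).items(), key=lambda kv: -kv[1]
    ):
        print(sorted(itemset), support)
```

In this toy run the itemset `{a}` is reported by neither site, yet it appears in the merged result because it is the intersection of `{a, b}` and `{a, c}`. Any itemset that a site did not return contributes 0 from that site, which is the gap the paper's support approximation is meant to address.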
Pages: 221 / +
Page count: 4