An efficient parallel row enumerated algorithm for mining frequent colossal closed itemsets from high dimensional datasets

Cited by: 12
Authors
Vanahalli, Manjunath K. [1 ]
Patil, Nagamma [1 ]
Affiliations
[1] Natl Inst Technol Karnataka, Dept Informat Technol, Mangalore 575025, India
Keywords
Bioinformatics; High dimensional datasets; Parallel preprocessing; Parallel algorithm; Colossal closed itemsets; Rowset cardinality table; PATTERNS;
DOI
10.1016/j.ins.2018.08.009
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Mining colossal itemsets from high dimensional datasets has gained attention in recent times. Conventional algorithms spend most of their time mining small and mid-sized itemsets, which do not carry valuable and complete information for decision making. Mining Frequent Colossal Closed Itemsets (FCCI) from a high dimensional dataset plays a significant role in decision making for many applications, especially in the field of bioinformatics. When mining FCCI from a high dimensional dataset, the existing preprocessing techniques fail to prune the complete set of irrelevant features and irrelevant rows. Moreover, the state-of-the-art algorithms for this task are sequential and computationally expensive. This work proposes an Effective Improved Parallel Preprocessing (EIPP) technique to prune the complete set of irrelevant features and irrelevant rows from a high dimensional dataset, and a novel, efficient Parallel Frequent Colossal Closed Itemset Mining (PFCCIM) algorithm. Further, the PFCCIM algorithm is integrated with a novel Rowset Cardinality Table (RCT), an efficient method to check the closedness of a rowset, and an efficient pruning strategy to cut down the mining search space. The proposed PFCCIM algorithm is the first parallel algorithm to mine FCCI from a high dimensional dataset. The performance study shows the improved effectiveness of the proposed EIPP technique over the existing preprocessing techniques and the improved efficiency of the proposed PFCCIM algorithm over the existing algorithms. (C) 2018 Elsevier Inc. All rights reserved.
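To make the terms in the abstract concrete, the following is a minimal sequential sketch in Python of row-enumeration mining for frequent colossal closed itemsets. It is not the authors' EIPP, RCT, or PFCCIM: the function names preprocess and mine_fcci, the parameters min_support (at least 1) and min_cardinality, and the brute-force enumeration over row combinations are all assumptions made for illustration, and the sketch is only practical for a handful of rows.

```python
from itertools import combinations


def preprocess(rows, min_support, min_cardinality):
    """Hypothetical pruning step (not EIPP): repeatedly drop items that occur
    in fewer than min_support rows and rows left with fewer than
    min_cardinality items, until nothing changes."""
    rows = [set(r) for r in rows]
    while True:
        counts = {}
        for r in rows:
            for item in r:
                counts[item] = counts.get(item, 0) + 1
        frequent = {i for i, c in counts.items() if c >= min_support}
        pruned = [r & frequent for r in rows]
        pruned = [r for r in pruned if len(r) >= min_cardinality]
        if pruned == rows:
            return rows
        rows = pruned


def mine_fcci(rows, min_support, min_cardinality):
    """Brute-force row enumeration (illustrative only, assumes min_support >= 1).
    Returns {itemset: support} for closed itemsets with at least
    min_cardinality items that occur in at least min_support rows."""
    rows = preprocess(rows, min_support, min_cardinality)
    n = len(rows)
    result = {}
    for k in range(max(1, min_support), n + 1):
        for rowset in combinations(range(n), k):
            # The itemset of a rowset is the intersection of its rows.
            itemset = frozenset(set.intersection(*(rows[i] for i in rowset)))
            if len(itemset) < min_cardinality:
                continue
            # The rowset is closed if no row outside it contains the itemset.
            supporting = tuple(i for i in range(n) if itemset <= rows[i])
            if supporting == rowset:
                result[itemset] = k
    return result


if __name__ == "__main__":
    data = [
        {"a", "b", "c", "d"},
        {"a", "b", "c", "e"},
        {"a", "b", "c", "d"},
        {"a", "f"},
        {"b", "c", "g"},
    ]
    for itemset, support in mine_fcci(data, 2, 3).items():
        print(sorted(itemset), support)
```

In this toy dataset the pruning step removes the infrequent items e, f, and g and then the two rows that become too short, so only three rows are enumerated. The closedness test is the naive one (rescan every row); the abstract attributes a more efficient check to the Rowset Cardinality Table, which this sketch does not attempt to reproduce.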
Pages: 343-362
Number of pages: 20