An efficient parallel row enumerated algorithm for mining frequent colossal closed itemsets from high dimensional datasets

Cited by: 12
Authors
Vanahalli, Manjunath K. [1 ]
Patil, Nagamma [1 ]
Affiliation
[1] Natl Inst Technol Karnataka, Dept Informat Technol, Mangalore 575025, India
Keywords
Bioinformatics; High dimensional datasets; Parallel preprocessing; Parallel algorithm; Colossal closed itemsets; Rowset cardinality table; PATTERNS;
DOI
10.1016/j.ins.2018.08.009
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Mining colossal itemsets from high dimensional datasets has gained attention in recent times. Conventional algorithms spend most of their time mining small and mid-sized itemsets, which do not carry valuable and complete information for decision making. Mining Frequent Colossal Closed Itemsets (FCCI) from a high dimensional dataset plays a highly significant role in decision making for many applications, especially in the field of bioinformatics. When mining FCCI from a high dimensional dataset, the existing preprocessing techniques fail to prune the complete set of irrelevant features and irrelevant rows. Besides, the state-of-the-art algorithms for this task are sequential and computationally expensive. The proposed work presents an Effective Improved Parallel Preprocessing (EIPP) technique to prune the complete set of irrelevant features and irrelevant rows from a high dimensional dataset, and a novel, efficient Parallel Frequent Colossal Closed Itemset Mining (PFCCIM) algorithm. Further, the PFCCIM algorithm is integrated with a novel Rowset Cardinality Table (RCT), an efficient method to check the closedness of a rowset, and an efficient pruning strategy to cut down the mining search space. The proposed PFCCIM algorithm is the first parallel algorithm to mine FCCI from a high dimensional dataset. The performance study shows the improved effectiveness of the proposed EIPP technique over the existing preprocessing techniques and the improved efficiency of the proposed PFCCIM algorithm over the existing algorithms. (C) 2018 Elsevier Inc. All rights reserved.
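The ideas named in the abstract (pruning irrelevant features and rows, row enumeration, and a closedness check on a rowset) can be illustrated with a small sequential sketch. This is not the paper's EIPP, PFCCIM, or RCT, whose details are not given here. It is a brute-force illustration under stated assumptions: a dataset is a dict mapping a row id to a set of items, `min_sup` is the minimum number of supporting rows, and `min_card` is the minimum itemset size treated as "colossal". All function names are hypothetical.

```python
from itertools import combinations

def itemset_of(rows, dataset):
    """Items shared by every row in the rowset."""
    rows = list(rows)
    common = set(dataset[rows[0]])
    for r in rows[1:]:
        common.intersection_update(dataset[r])
    return frozenset(common)

def rowset_of(items, dataset):
    """Rows that contain every item in the itemset."""
    return frozenset(r for r, t in dataset.items() if items <= t)

def is_closed_rowset(rows, dataset):
    """A rowset is closed if no other row contains its common itemset."""
    items = itemset_of(rows, dataset)
    return bool(items) and rowset_of(items, dataset) == frozenset(rows)

def prune(dataset, min_sup, min_card):
    """Repeatedly drop infrequent items and too-short rows until stable."""
    data = {r: set(t) for r, t in dataset.items()}
    changed = True
    while changed:
        changed = False
        support = {}
        for t in data.values():
            for i in t:
                support[i] = support.get(i, 0) + 1
        infrequent = {i for i, s in support.items() if s < min_sup}
        if infrequent:
            changed = True
            for t in data.values():
                t -= infrequent
        short = [r for r, t in data.items() if len(t) < min_card]
        if short:
            changed = True
            for r in short:
                del data[r]
    return {r: frozenset(t) for r, t in data.items()}

def mine_fcci(dataset, min_sup, min_card):
    """Brute-force row enumeration; exponential, for illustration only."""
    data = prune(dataset, min_sup, min_card)
    rows = sorted(data)
    results = {}
    for k in range(min_sup, len(rows) + 1):
        for combo in combinations(rows, k):
            items = itemset_of(combo, data)
            if len(items) >= min_card and rowset_of(items, data) == frozenset(combo):
                results[items] = k  # support equals the rowset size
    return results

# Toy dataset (made up for this sketch): four rows over items a..e.
toy = {
    "r1": {"a", "b", "c", "d"},
    "r2": {"a", "b", "c"},
    "r3": {"a", "b", "c", "e"},
    "r4": {"d", "e"},
}
print(mine_fcci(toy, min_sup=2, min_card=3))
```

On the toy data, pruning first removes row `r4` (too short), which makes items `d` and `e` infrequent and removes them in the next pass. The only closed rowset that remains is `{r1, r2, r3}`, with itemset `{a, b, c}`. A practical miner would avoid enumerating every row combination; the sketch only makes the definitions concrete.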
Pages: 343-362
Page count: 20