An efficient parallel row enumerated algorithm for mining frequent colossal closed itemsets from high dimensional datasets

Cited by: 12
Authors
Vanahalli, Manjunath K. [1 ]
Patil, Nagamma [1 ]
Affiliations
[1] Natl Inst Technol Karnataka, Dept Informat Technol, Mangalore 575025, India
Keywords
Bioinformatics; High dimensional datasets; Parallel preprocessing; Parallel algorithm; Colossal closed itemsets; Rowset cardinality table; PATTERNS;
DOI
10.1016/j.ins.2018.08.009
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Mining colossal itemsets from high dimensional datasets has gained attention in recent times. Conventional algorithms spend most of their time mining small and mid-sized itemsets, which do not carry valuable and complete information for decision making. Mining Frequent Colossal Closed Itemsets (FCCI) from a high dimensional dataset plays a highly significant role in decision making for many applications, especially in the field of bioinformatics. When mining FCCI from a high dimensional dataset, the existing preprocessing techniques fail to prune the complete set of irrelevant features and irrelevant rows. Besides, the state-of-the-art algorithms for this task are sequential and computationally expensive. The proposed work presents an Effective Improved Parallel Preprocessing (EIPP) technique to prune the complete set of irrelevant features and irrelevant rows from a high dimensional dataset, and a novel, efficient Parallel Frequent Colossal Closed Itemset Mining (PFCCIM) algorithm. Further, the PFCCIM algorithm is integrated with a novel Rowset Cardinality Table (RCT), an efficient method to check the closedness of a rowset, and an efficient pruning strategy to cut down the mining search space. The proposed PFCCIM algorithm is the first parallel algorithm to mine FCCI from a high dimensional dataset. The performance study shows the improved effectiveness of the proposed EIPP technique over the existing preprocessing techniques and the improved efficiency of the proposed PFCCIM algorithm over the existing algorithms. (C) 2018 Elsevier Inc. All rights reserved.
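The abstract does not give the details of EIPP, PFCCIM, or the RCT, so the following is only a minimal sequential sketch of the two generic ideas it mentions: pruning irrelevant features and rows, and checking whether a rowset is closed. It is not the authors' method. The function names, the thresholds `min_support` and `min_cardinality`, and the toy dataset are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, FrozenSet, Set

# Hypothetical representation: row id -> set of items (features) in that row.
Dataset = Dict[int, FrozenSet[str]]


def itemset_of(rows: Set[int], data: Dataset) -> FrozenSet[str]:
    """Items shared by every row in `rows` (empty rowset gives an empty itemset)."""
    if not rows:
        return frozenset()
    return frozenset.intersection(*(data[r] for r in rows))


def rowset_of(items: FrozenSet[str], data: Dataset) -> Set[int]:
    """All rows that contain every item in `items`."""
    return {r for r, row_items in data.items() if items <= row_items}


def is_closed_rowset(rows: Set[int], data: Dataset) -> bool:
    """A rowset is closed if no further row contains its common itemset."""
    return rowset_of(itemset_of(rows, data), data) == set(rows)


def prune(data: Dataset, min_support: int, min_cardinality: int) -> Dataset:
    """Repeatedly drop items occurring in fewer than `min_support` rows and
    rows left with fewer than `min_cardinality` items, until nothing changes.
    (Assumed generic preprocessing, not the EIPP technique itself.)"""
    current = dict(data)
    while True:
        counts = Counter(item for items in current.values() for item in items)
        frequent = {item for item, c in counts.items() if c >= min_support}
        pruned = {r: items & frequent for r, items in current.items()}
        pruned = {r: items for r, items in pruned.items()
                  if len(items) >= min_cardinality}
        if pruned == current:
            return pruned
        current = pruned


# Toy dataset: few rows, comparatively many items per row.
example: Dataset = {
    1: frozenset("abc"),
    2: frozenset("abc"),
    3: frozenset("abd"),
    4: frozenset("e"),
}
```

In this toy dataset, rows 1 and 2 share the itemset {a, b, c}, which no other row contains, so the rowset {1, 2} is closed. Rows 1 and 3 share only {a, b}, which row 2 also contains, so {1, 3} is not closed. Pruning is iterated because removing a row can push an item below the support threshold, and removing an item can push a row below the cardinality threshold.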
Pages: 343-362
Page count: 20