An efficient parallel row enumerated algorithm for mining frequent colossal closed itemsets from high dimensional datasets

Cited by: 12
Authors
Vanahalli, Manjunath K. [1 ]
Patil, Nagamma [1 ]
Affiliations
[1] Natl Inst Technol Karnataka, Dept Informat Technol, Mangalore 575025, India
Keywords
Bioinformatics; High dimensional datasets; Parallel preprocessing; Parallel algorithm; Colossal closed itemsets; Rowset cardinality table; PATTERNS;
DOI
10.1016/j.ins.2018.08.009
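Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract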
Mining colossal itemsets from high dimensional datasets has gained attention in recent times. Conventional algorithms spend most of their time mining small and mid-sized itemsets, which do not carry valuable and complete information for decision making. Mining Frequent Colossal Closed Itemsets (FCCI) from a high dimensional dataset plays a highly significant role in decision making for many applications, especially in the field of bioinformatics. When mining FCCI from a high dimensional dataset, the existing preprocessing techniques fail to prune the complete set of irrelevant features and irrelevant rows. Moreover, the state-of-the-art algorithms for this task are sequential and computationally expensive. This work proposes an Effective Improved Parallel Preprocessing (EIPP) technique to prune the complete set of irrelevant features and irrelevant rows from a high dimensional dataset, and a novel, efficient Parallel Frequent Colossal Closed Itemset Mining (PFCCIM) algorithm. Further, the PFCCIM algorithm is integrated with a novel Rowset Cardinality Table (RCT), an efficient method to check the closedness of a rowset, and an efficient pruning strategy to cut down the mining search space. The proposed PFCCIM algorithm is the first parallel algorithm to mine FCCI from a high dimensional dataset. The performance study shows the improved effectiveness of the proposed EIPP technique over the existing preprocessing techniques and the improved efficiency of the proposed PFCCIM algorithm over the existing algorithms. (C) 2018 Elsevier Inc. All rights reserved.
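To make the terms in the abstract concrete, the following is a minimal brute-force sketch of row enumeration with a closedness check. It is not the PFCCIM algorithm, the EIPP technique, or the Rowset Cardinality Table, none of which are specified in this record. It assumes the usual definitions: an itemset is frequent if at least `min_sup` rows contain it, colossal if it has at least `min_card` items, and closed if the rowset that generated it is exactly the set of rows containing it. The function name `mine_fcci`, both parameter names, and the toy data are hypothetical.

```python
from itertools import combinations


def mine_fcci(rows, min_sup, min_card):
    """Brute-force row enumeration (illustrative only, exponential time).

    rows     -- list of frozensets, one set of items per row
    min_sup  -- assumed minimum number of supporting rows
    min_card -- assumed minimum itemset size to count as "colossal"
    Returns a dict mapping each closed itemset to its supporting rowset.
    """
    result = {}
    row_ids = range(len(rows))
    # Enumerate rowsets (not itemsets): in high dimensional data there are
    # far fewer rows than items, which is the motivation for row enumeration.
    for k in range(max(1, min_sup), len(rows) + 1):
        for rowset in combinations(row_ids, k):
            # The itemset shared by every row in this rowset.
            itemset = frozenset.intersection(*(rows[r] for r in rowset))
            if len(itemset) < min_card:
                continue  # not colossal under the assumed threshold
            # Closedness check: the rowset must be exactly the rows
            # that contain the itemset.
            support = frozenset(r for r in row_ids if itemset <= rows[r])
            if support == frozenset(rowset):
                result[itemset] = support
    return result


# Toy data: 4 rows over single-letter items (hypothetical).
rows = [
    frozenset("abcde"),
    frozenset("abcdf"),
    frozenset("abcg"),
    frozenset("xyz"),
]
found = mine_fcci(rows, min_sup=2, min_card=3)
for itemset, support in found.items():
    print(sorted(itemset), sorted(support))
```

On this toy input the rowset {0, 2} yields the itemset {a, b, c}, but row 1 also contains it, so that rowset is not closed; the same itemset is reported once, from the closed rowset {0, 1, 2}. A practical algorithm would prune the search space instead of enumerating every rowset.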
Pages: 343-362
Number of pages: 20