Frequent itemset mining using FP-tree: a CLA-based approach and its extended application in biodiversity data

Cited by: 2
Authors
Ghosh, Moumita [1]
Roy, Anirban [2]
Sil, Pritam [1]
Mondal, Kartick Chandra [1]
Affiliations
[1] Jadavpur Univ, Dept Informat Technol, Kolkata, India
[2] West Bengal Biodivers Board, Kolkata, India
Keywords
Frequent itemset; Cellular learning automata; FP-growth; Biodiversity; Application; Species data; PATTERNS; ALGORITHMS
DOI
10.1007/s11334-022-00500-3
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Efficient discovery of frequent itemsets from a transaction database is the fundamental step of association rule mining in data analytics. Interesting associations among the items in a transaction database contribute to knowledge enrichment and thereby ease decision-making and pattern generation from massive amounts of data. A major problem of frequent itemset mining algorithms, however, is their excessive memory requirement, which makes them unsuitable for larger datasets with high-cardinality itemsets. Several novel data structures for mining frequent itemsets have been introduced in recent years, for example N-List, NodeSet, DiffNodeSet, and proximity list; they improve execution time but still leave scope for reducing memory requirements. In this paper, we propose a novel algorithm using cellular learning automata (CLA) and multiple FP-tree structures for frequent itemset mining that is efficient in both time and memory. The proposed method is compared extensively with leading algorithms on publicly available real and synthetic datasets designed specifically for pattern mining algorithms. The results indicate that the proposed method is memory-efficient and shows comparable execution time across varying dataset dimensions and densities, which supports its robustness. In addition to the new methodology, its potential domain-specific use in species biodiversity data analysis is discussed. Which groups of species are closely related can be derived from large species occurrence datasets. This could help in understanding species co-occurrence across multiple sites, which in turn assists in addressing ecology-related issues in afforestation and reforestation. It could be a step forward toward the advantageous use of computer science in the biodiversity domain.
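As a rough illustration of what "frequent itemset" means for species occurrence data, the sketch below counts how many sites contain each combination of species and keeps the combinations that meet a minimum support. It is a plain exhaustive enumeration, not FP-growth and not the CLA-based method of the paper; the function name `frequent_itemsets`, the `min_support` parameter, and the site records are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs in
    at least min_support transactions (naive enumeration, exponential in
    the size of each transaction; for illustration only)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Count every non-empty subset of this transaction once.
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical occurrence records: one set of species per surveyed site.
sites = [
    {"sal", "teak", "neem"},
    {"sal", "teak"},
    {"sal", "neem"},
    {"teak", "bamboo"},
]

result = frequent_itemsets(sites, min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

Here the pairs (neem, sal) and (sal, teak) each co-occur at two sites and so count as frequent at a support threshold of 2. Tree-based methods such as FP-growth aim to obtain the same result without enumerating every subset.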
Pages: 283-301
Page count: 19
Related papers
50 in total
  • [1] Frequent itemset mining using FP-tree: a CLA-based approach and its extended application in biodiversity data
    Moumita Ghosh
    Anirban Roy
    Pritam Sil
    Kartick Chandra Mondal
    Innovations in Systems and Software Engineering, 2023, 19 : 283 - 301
  • [2] Mining φ-Frequent Itemset Using FP-Tree
    李天瑞
    Journal of Southwest Jiaotong University, 2001, (01) : 67 - 74
  • [3] Mining Closed Frequent Itemset based on FP-Tree
    Li, Shengwei
    Li, Lingsheng
    Han, Chong
    2009 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING ( GRC 2009), 2009, : 354 - 357
  • [4] Inverse frequent itemset mining based on FP-tree
    Department of Computer Science and Technology, Peking University, Beijing 100871, China
    Unknown
    Ruan Jian Xue Bao, 2008, 2 : 338 - 350
  • [5] Mining frequent trajectory using FP-tree in GPS data
    Li, J. (lijunhuai@xaut.edu.cn), 1600, Binary Information Press, P.O. Box 162, Bethel, CT 06801-0162, United States (09):
  • [6] Building FP-Tree on the Fly: Single-Pass Frequent Itemset Mining
    Shahbazi, Nima
    Soltani, Rohollah
    Gryz, Jarek
    An, Aijun
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION (MLDM 2016), 2016, 9729 : 387 - 400
  • [7] Mining maximal frequent itemsets in data streams based on FP-Tree
    Ao, Fujiang
    Yan, Yuejin
    Huang, Jian
    Huang, Kedi
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, PROCEEDINGS, 2007, 4571 : 479 - +
  • [8] Mining frequent patterns based on compressed FP-tree without conditional FP-tree generation
    Chen, Fei
    Shang, Lin
    Li, Ming
    Chen, Zhao-qian
    Chen, Shi-fu
    2006 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, 2006, : 478 - +
  • [9] Parallelization of Frequent Itemset Mining Methods with FP-tree: An Experiment with PrePost+ Algorithm
    Jamsheela, Olakara
    Gopalakrishna, Raju
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2021, 18 (02) : 208 - 213
  • [10] An irregular CLA-based novel frequent pattern mining approach
    Ghosh, Moumita
    Mondal, Sourav
    Moondra, Harshita
    Utari, Dina Tri
    Roy, Anirban
    Mondal, Kartick Chandra
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2024, 16 (03)