Frequent itemset mining using FP-tree: a CLA-based approach and its extended application in biodiversity data

Cited by: 2
Authors
Ghosh, Moumita [1]
Roy, Anirban [2]
Sil, Pritam [1]
Mondal, Kartick Chandra [1]
Affiliations
[1] Jadavpur Univ, Dept Informat Technol, Kolkata, India
[2] West Bengal Biodivers Board, Kolkata, India
Keywords
Frequent itemset; Cellular learning automata; FP-growth; Biodiversity; Application; Species data; PATTERNS; ALGORITHMS
DOI
10.1007/s11334-022-00500-3
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
The efficient discovery of frequent itemsets from a transaction database is the fundamental step in association rule mining in data analytics. Interesting associations among the items in a transaction database contribute to knowledge enrichment, and thereby ease decision-making and pattern generation from massive amounts of data. However, a major problem with frequent itemset mining algorithms is their excessive memory requirements, which make them unsuitable for larger datasets containing itemsets of high cardinality. A few novel data structures for mining frequent itemsets have been introduced in recent years: N-List, NodeSet, DiffNodeSet, proximity list, and others offer a coherent mining approach that improves execution time while still leaving scope for further improvement in memory requirements. In this paper, we propose a novel algorithm using cellular learning automata (CLA) and multiple FP-tree structures for frequent itemset mining that is efficient in both time and memory. Extensive experiments compare the proposed method with leading algorithms on publicly available real and synthetic datasets designed specifically for pattern mining algorithms. The results indicate that the proposed method is memory-efficient and shows comparable execution time across varying dataset dimensions and densities, which supports its robustness. In addition to the new methodology for frequent itemset mining, its potential domain-specific use in species biodiversity data analysis is discussed. Which groups of species are closely related can be derived from the large occurrence records in species datasets. This could help in understanding species co-occurrence across multiple sites, which in turn assists in solving ecology-related issues in afforestation and reforestation. It could be a step forward toward the advantageous use of computer science in the biodiversity domain.
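To make the terms "frequent itemset" and "species co-occurrence" concrete, the following is a minimal sketch of naive support counting over occurrence records. It is not the CLA-based method the paper proposes, and it builds no FP-tree; it enumerates every subset of each transaction, which is exactly the memory-hungry behavior the abstract says better data structures aim to avoid. The function name `frequent_itemsets`, the `min_support` parameter, and the species and site data are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support_count} for every itemset that occurs in at
    least `min_support` transactions (naive enumeration, no FP-tree)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # de-duplicate and fix an item order
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1  # one more transaction contains this subset
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical occurrence records: one set of species names per survey site.
sites = [
    {"teak", "sal", "bamboo"},
    {"teak", "sal"},
    {"sal", "bamboo"},
    {"teak", "sal", "mango"},
]

result = frequent_itemsets(sites, min_support=3)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With a minimum support of 3 sites, the only multi-species itemset that survives in this toy data is the pair of sal and teak, which is the kind of co-occurrence signal the abstract refers to. The cost of this approach grows exponentially with the number of species per site, so it is usable only on very small inputs.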
Pages: 283-301
Page count: 19