Frequent itemset mining using FP-tree: a CLA-based approach and its extended application in biodiversity data

Cited by: 2
Authors
Ghosh, Moumita [1]
Roy, Anirban [2]
Sil, Pritam [1]
Mondal, Kartick Chandra [1]
Affiliations
[1] Jadavpur Univ, Dept Informat Technol, Kolkata, India
[2] West Bengal Biodivers Board, Kolkata, India
Keywords
Frequent itemset; Cellular learning automata; FP-growth; Biodiversity; Application; Species data; PATTERNS; ALGORITHMS
DOI
10.1007/s11334-022-00500-3
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
The efficient discovery of frequent itemsets from a transaction database is the fundamental step of association rule mining in data analytics. Interesting associations among the items in a transaction database contribute to knowledge enrichment and make decision-making and pattern generation from massive amounts of data easier. One of the major problems with frequent itemset mining algorithms, however, is their excessive memory requirements, which make them unsuitable for larger datasets containing itemsets of high cardinality. Several novel data structures for mining frequent itemsets have been introduced in recent years, for example N-List, NodeSet, DiffNodeSet, and proximity list. These improve execution time but still leave scope for further improvement in memory requirements. In this paper, we propose a novel algorithm for frequent itemset mining that uses cellular learning automata (CLA) and multiple FP-tree structures and is efficient in both time and memory. The proposed method has been compared extensively with the leading algorithms on publicly available real and synthetic datasets designed specifically for pattern mining algorithms. The results indicate that the proposed method is memory-efficient and shows comparable execution time across varying dataset dimensions and densities, which supports its robustness. In addition to the new methodology for frequent itemset mining, its potential domain-specific use in species biodiversity data analysis is discussed. Which groups of species are closely related can be derived from the large occurrence records in species datasets. This could help in understanding species co-occurrence across multiple sites, which in turn assists in solving ecology-related issues in afforestation and reforestation. It could be a step forward toward the advantageous use of computer science in the biodiversity domain.
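As a rough illustration of the kind of FP-tree mining the abstract refers to, the following is a minimal sketch of plain FP-growth applied to species co-occurrence. It is not the CLA-based, multiple-FP-tree algorithm proposed in the paper, whose details are not given in this record. The class and function names, the species names, the site data, and the support threshold are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


class FPNode:
    """One node of an FP-tree: an item, its count, and links to parent/children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs.

    Returns the header table (item -> list of nodes) and the support of
    every item that meets min_support.
    """
    support = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    root = FPNode(None, None)
    header = defaultdict(list)
    for items, count in transactions:
        # Keep only frequent items, ordered by descending support (ties by name).
        ordered = sorted(
            (i for i in set(items) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    header, frequent = build_tree(transactions, min_support)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = frequent[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        result.update(fp_growth(conditional, min_support, itemset))
    return result


# Hypothetical occurrence records: each set lists the species seen at one site.
sites = [
    {"teak", "sal", "bamboo"},
    {"teak", "sal"},
    {"teak", "bamboo"},
    {"sal", "bamboo"},
    {"teak", "sal", "bamboo"},
]
result = fp_growth([(site, 1) for site in sites], min_support=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a minimum support of 3 sites, each single species and each pair of species is reported as frequent, while the three-species group, which occurs at only 2 sites, is not. Treating each site as a transaction and each species as an item is one plausible way to read the co-occurrence use case described above.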
Pages: 283-301
Page count: 19