Frequent itemset mining using FP-tree: a CLA-based approach and its extended application in biodiversity data

Cited by: 2
Authors
Ghosh, Moumita [1]
Roy, Anirban [2]
Sil, Pritam [1]
Mondal, Kartick Chandra [1]
Affiliations
[1] Jadavpur Univ, Dept Informat Technol, Kolkata, India
[2] West Bengal Biodivers Board, Kolkata, India
Keywords
Frequent itemset; Cellular learning automata; FP-growth; Biodiversity; Application; Species data; PATTERNS; ALGORITHMS
DOI
10.1007/s11334-022-00500-3
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
The efficient discovery of frequent itemsets from a transaction database is the fundamental step of association rule mining in data analytics. Interesting associations among the items in a transaction database contribute to knowledge enrichment and thereby ease decision-making and pattern generation from massive amounts of data. However, a major problem with frequent itemset mining algorithms is their excessive memory requirement, which makes them unsuitable for larger datasets containing itemsets of high cardinality. A few novel data structures for mining frequent itemsets have been introduced in recent years, for example N-List, NodeSet, DiffNodeSet, and proximity list. These offer a coherent mining approach that improves execution time but still leaves scope for further improvement in memory requirements. In this paper, we propose a novel algorithm using cellular learning automata (CLA) and multiple FP-tree structures for frequent itemset mining that is efficient in both time and memory requirements. Extensive experiments compare the proposed method with the leading algorithms on publicly available real and synthetic datasets designed specifically for pattern mining algorithms. The results indicate that the proposed method is memory-efficient and shows comparable execution time across varying dataset dimensions and densities, which supports its robustness. In addition to the new methodology for frequent itemset mining, its potential domain-specific use in species biodiversity data analysis is discussed. Which groups of species are closely related can be derived from the large occurrence records in species datasets. This could help in understanding species co-occurrence across multiple sites, which in turn assists in addressing ecology-related issues in afforestation and reforestation. It could be a step forward toward the advantageous use of computer science in the biodiversity domain.
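To make the notion of a frequent itemset in species occurrence data concrete, the following is a minimal sketch only. It uses plain brute-force enumeration, not the CLA and FP-tree method proposed in the paper, and it is practical only for tiny inputs. The function name `frequent_itemsets`, the species names, and the site records are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions.

    Brute-force illustration: enumerates every subset of every transaction,
    so the cost grows exponentially with transaction size.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order so tuples compare equal
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical occurrence records: one set of species per survey site.
sites = [
    {"sal", "teak", "neem"},
    {"sal", "teak"},
    {"sal", "neem"},
    {"teak", "bamboo"},
]

# Species groups that co-occur at two or more sites.
result = frequent_itemsets(sites, min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

Here each survey site plays the role of a transaction and each species the role of an item, so an itemset such as `("sal", "teak")` with support 2 means the two species were recorded together at two sites.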
Pages: 283-301
Page count: 19
Related papers
50 in total
  • [21] An improved algorithm for mining maximal frequent itemsets based on FP-tree
    Chen TongQing
    Ye FeiYue
    Ge XiCong
    Liu Qi
    2018 17TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS FOR BUSINESS ENGINEERING AND SCIENCE (DCABES), 2018, : 225 - 228
  • [22] Improved algorithm for mining maximum frequent patterns based on FP-Tree
    Liu, Naili
    Ma, Lei
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION APPLICATIONS (ICCIA 2012), 2012, : 833 - 836
  • [23] FREQUENT ITEMSETS MINING ALGORITHM BASED ON DIFFERENTIAL PRIVACY AND FP-TREE
    Ding Zhe
    Wu Chunwang
    Zhao Jun
    Li Binyong
    2020 17TH INTERNATIONAL COMPUTER CONFERENCE ON WAVELET ACTIVE MEDIA TECHNOLOGY AND INFORMATION PROCESSING (ICCWAMTIP), 2020, : 271 - 274
  • [24] Mining weighted frequent patterns based on an improved weighted FP-tree
    Wang, Yan
    Xue, Haiyan
    Journal of Information and Computational Science, 2010, 7 (02): : 527 - 533
  • [25] Mining frequent itemsets with positive and negative items based on FP-tree
    College of Computer Science, Chongqing University, Chongqing 400030, China
    Unknown
    Unknown
    Moshi Shibie yu Rengong Zhineng, 2008, 2 (246-253):
  • [26] Mining maximal frequent patterns based on improved FP-tree in database
    Liu Wenzhou
    Hao Xinghai
    Meng Xiangping
    Wang Huajin
    3RD INT CONF ON CYBERNETICS AND INFORMATION TECHNOLOGIES, SYSTEMS, AND APPLICAT/4TH INT CONF ON COMPUTING, COMMUNICATIONS AND CONTROL TECHNOLOGIES, VOL 2, 2006, : 297 - +
  • [27] A New Algorithm For Frequent Itemsets Mining Based On Apriori And FP-Tree
    Lan, Qihua
    Zhang, Defu
    Wu, Bo
    PROCEEDINGS OF THE 2009 WRI GLOBAL CONGRESS ON INTELLIGENT SYSTEMS, VOL II, 2009, : 360 - 364
  • [28] Fast Mining Maximal Frequent Itemsets Based On Sorted FP-Tree
    Yang, Junrui
    Guo, Yunkai
    Liu, Nanyan
    2008 7TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-23, 2008, : 5391 - 5395
  • [29] Modified FP-Growth: An Efficient Frequent Pattern Mining Approach from FP-Tree
    Ahmed, Shafiul Alom
    Nath, Bhabesh
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2019, PT I, 2019, 11941 : 47 - 55
  • [30] Visualizing the Construction of Incremental Disorder Trie Itemset Data Structure (DOSTrieIT) for Frequent Pattern Tree (FP-Tree)
    Abdullah, Zailani
    Herawan, Tutut
    Deris, Mustafa Mat
    VISUAL INFORMATICS: SUSTAINING RESEARCH AND INNOVATIONS, PT I, 2011, 7066 : 183 - 195