MAFIA: A maximal frequent itemset algorithm

Cited by: 146
Authors
Burdick, D
Calimlim, M
Flannick, J
Gehrke, J
Yiu, TM
Affiliations
[1] Univ Wisconsin, Madison, WI 53706 USA
[2] Cornell Univ, Ithaca, NY 14853 USA
[3] Stanford Univ, James H Clark Ctr, Stanford, CA 94305 USA
[4] Amazon Com, Seattle, WA 98104 USA
Funding
National Science Foundation (USA)
Keywords
itemset mining; maximal itemsets; transactional databases
DOI
10.1109/TKDE.2005.183
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a new algorithm for mining maximal frequent itemsets from a transactional database. The search strategy of the algorithm integrates a depth-first traversal of the itemset lattice with effective pruning mechanisms that significantly improve mining performance. Our implementation for support counting combines a vertical bitmap representation of the data with an efficient bitmap compression scheme. In a thorough experimental analysis, we isolate the effects of individual components of MAFIA including search space pruning techniques and adaptive compression. We also compare our performance with previous work by running tests on very different types of data sets. Our experiments show that MAFIA performs best when mining long itemsets and outperforms other algorithms on dense data by a factor of three to 30.
Pages: 1490-1504
Number of pages: 15
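
The abstract describes a depth-first traversal of the itemset lattice with support counting on vertical bitmaps. The following is a minimal sketch of that general idea, not the authors' implementation: Python integers stand in for bitmaps, no compression is applied, and the only pruning shown is a simple "already covered by a known maximal itemset" check. All function names are hypothetical and chosen for illustration.

```python
from typing import Dict, FrozenSet, List, Sequence, Set


def build_vertical_bitmaps(transactions: Sequence[Set[str]]) -> Dict[str, int]:
    """Map each item to an integer whose bit t is set when transaction t contains it."""
    bitmaps: Dict[str, int] = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bitmap: int) -> int:
    """Support of an itemset is the number of set bits in its bitmap."""
    return bin(bitmap).count("1")


def maximal_frequent_itemsets(
    transactions: Sequence[Set[str]], min_support: int
) -> List[FrozenSet[str]]:
    """Depth-first search for maximal frequent itemsets (simplified sketch)."""
    bitmaps = build_vertical_bitmaps(transactions)
    # Only frequent single items can appear in a frequent itemset.
    items = sorted(i for i, b in bitmaps.items() if support(b) >= min_support)
    maximal: List[FrozenSet[str]] = []

    def dfs(head: FrozenSet[str], head_bitmap: int, tail: List[str]) -> None:
        # Simplified pruning: if the head plus every remaining candidate is
        # contained in a known maximal itemset, this subtree adds nothing new.
        if any(head | set(tail) <= m for m in maximal):
            return
        extended = False
        for idx, item in enumerate(tail):
            # Intersecting bitmaps yields the transactions containing head + item.
            new_bitmap = head_bitmap & bitmaps[item]
            if support(new_bitmap) >= min_support:
                extended = True
                dfs(head | {item}, new_bitmap, tail[idx + 1:])
        # A node with no frequent extension is maximal unless a superset
        # was already recorded earlier in the traversal.
        if not extended and not any(head <= m for m in maximal):
            maximal.append(frozenset(head))

    all_tids = (1 << len(transactions)) - 1  # the empty itemset matches every transaction
    dfs(frozenset(), all_tids, items)
    return [m for m in maximal if m]  # drop the empty itemset if nothing is frequent


example = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"b", "d"}, {"c", "d"}]
result = maximal_frequent_itemsets(example, min_support=2)
print(sorted(sorted(m) for m in result))  # [['a', 'b', 'c'], ['d']]
```

In this toy database, {a, b, c} appears in two transactions and no frequent superset of {d} exists, so those two itemsets are the maximal ones at a minimum support of 2. A bitwise AND per extension is what makes the vertical layout attractive for dense data; the compression and additional pruning techniques mentioned in the abstract are not reproduced here.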