MAFIA: A maximal frequent itemset algorithm

Citations: 146
Authors
Burdick, D
Calimlim, M
Flannick, J
Gehrke, J
Yiu, TM
Affiliations
[1] Univ Wisconsin, Madison, WI 53706 USA
[2] Cornell Univ, Ithaca, NY 14853 USA
[3] Stanford Univ, James H Clark Ctr, Stanford, CA 94305 USA
[4] Amazon Com, Seattle, WA 98104 USA
Funding
U.S. National Science Foundation
Keywords
itemset mining; maximal itemsets; transactional databases
DOI
10.1109/TKDE.2005.183
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a new algorithm for mining maximal frequent itemsets from a transactional database. The search strategy of the algorithm integrates a depth-first traversal of the itemset lattice with effective pruning mechanisms that significantly improve mining performance. Our implementation for support counting combines a vertical bitmap representation of the data with an efficient bitmap compression scheme. In a thorough experimental analysis, we isolate the effects of individual components of MAFIA including search space pruning techniques and adaptive compression. We also compare our performance with previous work by running tests on very different types of data sets. Our experiments show that MAFIA performs best when mining long itemsets and outperforms other algorithms on dense data by a factor of three to 30.
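The abstract's two core ideas, support counting over a vertical bitmap and a depth-first search that prunes branches already covered by a known maximal itemset, can be illustrated with a minimal Python sketch. This is not the authors' implementation: Python integers stand in for bitmaps, the bitmap compression step is omitted, and the superset check is a simplified stand-in for the paper's pruning mechanisms. The function names (`build_vertical_bitmaps`, `maximal_frequent_itemsets`) are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Sequence


def build_vertical_bitmaps(transactions: Sequence[Iterable[str]]) -> Dict[str, int]:
    """Map each item to an int whose bit i is set iff transaction i contains it."""
    bitmaps: Dict[str, int] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bitmap: int) -> int:
    """Support of an itemset = number of set bits in its bitmap."""
    return bin(bitmap).count("1")


def maximal_frequent_itemsets(
    transactions: Sequence[Iterable[str]], min_support: int
) -> List[FrozenSet[str]]:
    """Depth-first search for maximal frequent itemsets (illustrative only)."""
    bitmaps = build_vertical_bitmaps(transactions)
    items = sorted(i for i, b in bitmaps.items() if support(b) >= min_support)
    all_tids = (1 << len(transactions)) - 1
    maximal: List[FrozenSet[str]] = []

    def dfs(head: FrozenSet[str], head_bitmap: int, tail: List[str]) -> None:
        # Intersect bitmaps (bitwise AND) to count support of head + item,
        # and keep only the extensions that remain frequent.
        extensions = [(it, head_bitmap & bitmaps[it]) for it in tail]
        extensions = [(it, bm) for it, bm in extensions if support(bm) >= min_support]

        # Simplified pruning: if head plus every remaining candidate item is
        # already inside a known maximal itemset, nothing new lies below.
        candidate = head | {it for it, _ in extensions}
        if any(candidate <= m for m in maximal):
            return

        if not extensions:
            if head:  # ignore the empty itemset
                maximal.append(head)
            return

        for idx, (it, bm) in enumerate(extensions):
            remaining = [x for x, _ in extensions[idx + 1:]]
            dfs(head | {it}, bm, remaining)

    dfs(frozenset(), all_tids, items)
    return maximal


if __name__ == "__main__":
    data = [{"a", "b"}, {"a", "b"}, {"b", "c"}, {"b", "c"}, {"a", "c"}]
    print(maximal_frequent_itemsets(data, min_support=2))
```

Because the search visits itemsets in a fixed item order, any frequent superset of the current node that lies outside its subtree has already been explored, which is why a single subset check against the maximal itemsets found so far is enough to keep the output free of non-maximal sets in this sketch.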
Pages: 1490-1504
Page count: 15