An Efficient Algorithm for Mining Large Item Sets

Cited by: 1
Authors
Zheng, Hong-Zhen [1 ]
Chu, Dian-Hui [1 ]
Zhan, De-Chen [1 ]
Xu, Xiao-Fei [1 ]
Affiliation
[1] Harbin Inst Technol, Coll Comp Sci & Technol, Weihai 264209, Peoples R China
Keywords
Large item sets; Data mining; Association rules
DOI
10.1109/FSKD.2008.679
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes the Online Mining Algorithm (OMA), which discovers large item sets online. Rather than presetting a default threshold, OMA achieves its efficiency and threshold flexibility by calculating item-set counts. A default threshold is therefore unnecessary: the algorithm is independent of it and can flexibly adapt to any user-supplied threshold. In addition, we propose the Cluster-Based Association Rule Algorithm (CARA), which creates cluster tables to aid the discovery of large item sets. CARA requires only a single scan of the database, followed by comparisons against the partial cluster tables. It not only prunes considerable amounts of data, reducing the time needed for data scans and requiring fewer comparisons, but also ensures the correctness of the mined results. Because CARA creates the cluster tables in advance, each CPU can be assigned one cluster table to process; thus large item sets can be mined immediately even when the database is very large.
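The threshold-flexibility idea in the abstract (store item-set counts once, then answer any user-supplied threshold without rescanning) can be illustrated with a minimal sketch. This is not the paper's OMA or CARA; the function names `count_itemsets` and `large_itemsets`, the `max_size` cap, and the toy transactions are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Single scan of the data: count every item set of up to max_size items.

    The max_size cap is an assumption of this sketch, used to keep the
    number of stored counts manageable.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts


def large_itemsets(counts, n_transactions, min_support):
    """Filter the stored counts by a support threshold given as a fraction."""
    min_count = min_support * n_transactions
    return {itemset: c for itemset, c in counts.items() if c >= min_count}


# Hypothetical toy database of four transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

counts = count_itemsets(transactions, max_size=2)

# The same stored counts answer different thresholds with no rescan.
loose = large_itemsets(counts, len(transactions), 0.5)
strict = large_itemsets(counts, len(transactions), 0.75)
```

Because the counts are kept independently of any threshold, changing the threshold only re-runs the cheap filtering step, which is the kind of flexibility the abstract describes.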
Pages: 561-564
Number of pages: 4