Frequent pattern mining on stream data using Hadoop CanTree-GTree

Cited by: 7
Authors
Kusumakumari, Vanteru [1]
Sherigar, Deepthi [1]
Chandran, Roshni [1]
Patil, Nagamma [1]
Affiliations
[1] National Institute of Technology Karnataka, Surathkal 575025, India
Keywords
Stream data mining; Frequent item sets; GTree; CanTree; Hadoop; ITEMSETS
DOI
10.1016/j.procs.2017.09.134
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The need for knowledge discovery from real-time stream data is continuously increasing, and processing transactions to mine patterns requires efficient data structures and algorithms. We propose a time-efficient Hadoop CanTree-GTree algorithm built on Apache Hadoop. Using the sliding window technique, the algorithm mines the complete set of frequent item sets (patterns) from real-time transactions. These are then mined for closed frequent item sets, from which association rules are derived. The algorithm makes use of two data structures, CanTree and GTree. The results show that the Hadoop implementation of the algorithm performs 5 times better than the Java implementation. (C) 2017 The Authors. Published by Elsevier B.V. Peer-review under responsibility of the scientific committee of the 7th International Conference on Advances in Computing & Communications.
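The pipeline the abstract outlines (sliding window, frequent item sets, then closed frequent item sets) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses brute-force enumeration in place of the CanTree and GTree structures and runs on a single machine without Hadoop. The function names, the toy stream, and the `window_size` and `min_support` values are all assumptions chosen for illustration.

```python
from collections import Counter, deque
from itertools import combinations


def frequent_itemsets(window, min_support):
    """Count every non-empty itemset in the window; keep those meeting min_support."""
    counts = Counter()
    for transaction in window:
        items = sorted(set(transaction))  # fixed (lexicographic) item order
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with identical support."""
    return {
        itemset: n
        for itemset, n in frequent.items()
        if not any(
            n == m and set(itemset) < set(other)
            for other, m in frequent.items()
        )
    }


def mine_stream(stream, window_size, min_support):
    """Yield the closed frequent itemsets after each arriving transaction."""
    window = deque(maxlen=window_size)  # oldest transaction drops out automatically
    for transaction in stream:
        window.append(transaction)
        yield closed_itemsets(frequent_itemsets(window, min_support))


# Hypothetical toy stream of four transactions (illustrative data only).
stream = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b", "c"]]
results = list(mine_stream(stream, window_size=3, min_support=2))
print(results[-1])
```

With a window of three transactions and a minimum support of two, the final window holds the last three transactions, and only item sets that are not subsumed by an equally supported superset are reported. Enumerating all subsets is exponential in transaction length, which is the cost that tree-based structures such as CanTree are designed to avoid.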
Pages: 266-273
Number of pages: 8
Related papers
50 in total
  • [1] Real-time stream data mining based on CanTree and Gtree
    Kim, Jaein
    Hwang, Buhyun
    INFORMATION SCIENCES, 2016, 367 : 512 - 528
  • [2] Frequent Pattern Mining for Dynamic Database by Using Hadoop GM-Tree and GTree
    Aung, Than Htike
    Kham, Nang Saing Moon
    BIG DATA ANALYSIS AND DEEP LEARNING APPLICATIONS, 2019, 744 : 149 - 159
  • [3] An Efficient Frequent Pattern Mining Algorithm for Data Stream
    Liu Hualei
    Lin Shukuan
    Qiao Jianzhong
    Yu Ge
    Lu Kaifu
    INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTATION TECHNOLOGY AND AUTOMATION, VOL 1, PROCEEDINGS, 2008, : 757 - 761
  • [4] CanTree: a canonical-order tree for incremental frequent-pattern mining
    Leung, Carson Kai-Sang
    Khan, Quamrul I.
    Li, Zhan
    Hoque, Tariqul
    KNOWLEDGE AND INFORMATION SYSTEMS, 2007, 11 (03) : 287 - 311
  • [6] Balanced Parallel Frequent Pattern Mining Over Massive Data Stream
    Fu, Xi
    Shi, Lei
    Li, Jing
    2017 THIRD IEEE INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (IEEE BIGDATASERVICE 2017), 2017, : 50 - 59
  • [7] An Improved PrePost Algorithm for Frequent Pattern Mining with Hadoop on Cloud
    Thakare, Sanket
    Rathi, Sheetal
    Sedamkar, R. R.
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON COMMUNICATION, COMPUTING AND VIRTUALIZATION (ICCCV) 2016, 2016, 79 : 207 - 214
  • [8] Frequent Pattern Mining with Uncertain Data
    Aggarwal, Charu C.
    Li, Yan
    Wang, Jianyong
    Wang, Jing
    KDD-09: 15TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2009, : 29 - 37
  • [9] A Novel Frequent Pattern Mining Algorithm for Real-time Radar Data Stream
    Huang, Fang
    Zheng, Ningning
    TRAITEMENT DU SIGNAL, 2019, 36 (01) : 23 - 30
  • [10] A novel frequent pattern mining technique for prediction of user behavior on web stream data
    Dhanalakshmi P.
    Ingenierie des Systemes d'Information, 2019, 24 (01) : 51 - 56