Map-optimize-reduce: CAN tree assisted FP-growth algorithm for clusters based FP mining on Hadoop

Cited by: 21
Authors
Ragaventhiran, J. [1 ]
Kavithadevi, M. K. [2 ]
Affiliations
[1] Syed Ammal Engn Coll, Dept CSE, Ramanathapuram, India
[2] Thiagarajar Coll Engn, Dept CSE, Madurai, Tamil Nadu, India
Keywords
Frequent pattern mining; Map-optimize-reduce; Clustering; Load balancing; CAN tree based FP growth; User query; FREQUENT PATTERNS; SEQUENTIAL PATTERNS; SKEWED DATA; MAPREDUCE;
DOI
10.1016/j.future.2019.09.041
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
In recent years, Frequent Pattern Mining (FPM) has emerged as a significant approach for discovering interesting knowledge hidden in data. However, earlier works did not address the validation of FPM with user queries, and achieving good scalability and execution time remains a bottleneck because large datasets are difficult to handle. To address these shortcomings, the proposed work implements FPM using an extended version of the MapReduce framework in a Hadoop environment. It comprises five processes: 1) Preprocessing, 2) Affinity Propagation (AP) based Clustering, 3) Load Balancing, 4) Map-Optimize-Reduce, and 5) Mining User Queries. First, preprocessing removes data redundancy. To speed up the MapReduce framework, we propose AP clustering, which generates effective clusters from the given dataset. Load balancing is then performed to distribute the load among different blocks, for which a reputation is computed. To avoid oversights in scanning and to minimize the search space in MapReduce, an optimizer based on Emperor Penguin Colony (EPC) optimization is placed between the Mapper and the Reducer. Frequent patterns are mined using CANonical order (CAN) tree based Frequent Pattern (FP) growth, which reduces execution time and frequent tree construction. The user submits a Mining_Request to Hadoop, and the frequent patterns mined for the given query are sent back to the user. If the user's query is not present in the CAN tree, Relevance Feedback is sent to the user as a recommendation. Finally, we validate the performance of the proposed work against previous works on the following metrics: Execution Time, Response Time, Load Balancing Rate, and Scalability. © 2019 Published by Elsevier B.V.
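The CAN tree idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the canonical order is plain lexicographic order of item names, and the names CanNode, CanTree, insert and support are hypothetical, introduced only for illustration. The sketch shows how a prefix tree built in a fixed item order can answer a support query for a user-supplied itemset; clustering, load balancing, the EPC optimizer and Relevance Feedback are not modeled.

```python
class CanNode:
    """One node of the prefix tree: an item and the number of transactions passing through it."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class CanTree:
    """Prefix tree whose paths follow a fixed canonical item order.

    Assumption: the canonical order is lexicographic order of item names.
    """

    def __init__(self):
        self.root = CanNode()

    def insert(self, transaction):
        # Sort items canonically so the tree shape does not depend on item frequencies.
        node = self.root
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, CanNode(item))
            node.count += 1

    def support(self, itemset):
        # Number of inserted transactions that contain every item of the itemset.
        target = sorted(set(itemset))
        if not target:
            return 0
        return self._count(self.root, target, 0)

    def _count(self, node, target, idx):
        total = 0
        for item, child in node.children.items():
            if item > target[idx]:
                # Canonical order: target[idx] cannot occur below a larger item.
                continue
            if item == target[idx]:
                if idx + 1 == len(target):
                    total += child.count
                else:
                    total += self._count(child, target, idx + 1)
            else:
                # Smaller item: keep looking for target[idx] deeper on this path.
                total += self._count(child, target, idx)
        return total


if __name__ == "__main__":
    tree = CanTree()
    for t in [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]:
        tree.insert(t)
    print(tree.support(["a", "c"]))
```

Because the item order is fixed in advance, new transactions can be inserted without reordering existing paths, which is the property that makes a canonical-order tree convenient for incremental construction. A query for an itemset with no matching path simply returns a support of 0 in this sketch.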
Pages: 111-122
Page count: 12
Related papers
50 in total
  • [1] A Frequent Pattern Mining Algorithm Based on FP-growth without Generating Tree
    Tohidi, Hossein
    Ibrahim, Hamidah
    PROCEEDINGS OF KNOWLEDGE MANAGEMENT 5TH INTERNATIONAL CONFERENCE 2010, 2010, : 723 - 728
  • [2] The Research and Improvement Based on FP-Growth Data Mining Algorithm
    Yao, Quanzhu
    Gao, Xingxing
    Lei, Xueli
    Zhang, Tong
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON MODELING, SIMULATION AND OPTIMIZATION TECHNOLOGIES AND APPLICATIONS (MSOTA2016), 2016, 58 : 287 - 293
  • [3] Batch incremental processing for FP-tree construction using FP-Growth algorithm
    Shashikumar G. Totad
    R. B. Geeta
    P. V. G. D. Prasad Reddy
    Knowledge and Information Systems, 2012, 33 : 475 - 490
  • [4] Batch incremental processing for FP-tree construction using FP-Growth algorithm
    Totad, Shashikumar G.
    Geeta, R. B.
    Reddy, P. V. G. D. Prasad
    KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 33 (02) : 475 - 490
  • [5] An optimized fuzzy based FP-growth algorithm for mining temporal data
    Kumar, B. Praveen
    Padmavathy, T.
    Muthunagai, S. U.
    Paulraj, D.
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2024, 46 (01) : 41 - 51
  • [6] Modified FP-Growth: An Efficient Frequent Pattern Mining Approach from FP-Tree
    Ahmed, Shafiul Alom
    Nath, Bhabesh
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2019, PT I, 2019, 11941 : 47 - 55
  • [7] TCM Constitution Analysis Method Based on Parallel FP-Growth Algorithm in Hadoop Framework
    Li, Mingzheng
    Lv, Xiaojuan
    Liu, Ye
    Wang, Lin
    Song, Jianqiang
    JOURNAL OF HEALTHCARE ENGINEERING, 2022, 2022
  • [8] A Parallel FP-growth Algorithm Based on GPU
    Jiang, Hao
    Meng, He
    2017 IEEE 14TH INTERNATIONAL CONFERENCE ON E-BUSINESS ENGINEERING (ICEBE 2017), 2017, : 97 - 102
  • [9] SQL based frequent pattern mining with FP-growth
    Shang, XQ
    Sattler, KU
    Geist, I
    APPLICATIONS OF DECLARATIVE PROGRAMMING AND KNOWLEDGE MANAGEMENT, 2005, 3392 : 32 - 46
  • [10] Distributed FP-ARMH Algorithm in Hadoop Map Reduce Framework
    Natarajan, Surendar
    Sehar, Sountharrajan
    2013 INTERNATIONAL CONFERENCE ON GREEN COMPUTING, COMMUNICATION AND CONSERVATION OF ENERGY (ICGCE), 2013, : 264 - 270