Map-optimize-reduce: CAN tree assisted FP-growth algorithm for clusters based FP mining on Hadoop

Cited by: 21
Authors
Ragaventhiran, J. [1]
Kavithadevi, M. K. [2]
Affiliations
[1] Syed Ammal Engn Coll, Dept CSE, Ramanathapuram, India
[2] Thiagarajar Coll Engn, Dept CSE, Madurai, Tamil Nadu, India
Keywords
Frequent pattern mining; Map-optimize-reduce; Clustering; Load balancing; CAN tree based FP growth; User query; FREQUENT PATTERNS; SEQUENTIAL PATTERNS; SKEWED DATA; MAPREDUCE;
DOI
10.1016/j.future.2019.09.041
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
In recent years, Frequent Pattern Mining (FPM) has emerged as a significant approach for discovering interesting knowledge concealed in data. However, preceding works failed to address the validation of FPM with user queries, and achieving better scalability and execution time remains a bottleneck owing to the difficulty of handling large datasets. To address these shortcomings, our proposed work performs FPM using an extended version of the MapReduce framework in a Hadoop environment. The proposed work comprises five processes: 1) Preprocessing, 2) Affinity Propagation (AP) based Clustering, 3) Load Balancing, 4) Map-Optimize-Reduce, and 5) Mining User Queries. First, preprocessing removes data redundancy. To speed up the MapReduce framework, we propose AP clustering, which generates effective clusters from the given dataset. Load balancing is then executed to balance the load among the different blocks, for which a reputation is computed. To avoid oversights in scanning and to minimize the search space in MapReduce, an optimizer based on Emperor Penguin Colony (EPC) optimization is included between the Mapper and the Reducer. Frequent patterns are mined using CANonical order (CAN) tree based Frequent Pattern (FP) growth, which reduces execution time and frequent tree construction. The user submits a Mining_Request to Hadoop, and the frequent patterns mined for the given query are sent back to the user. If the user's query is not present in the CAN tree, the system sends Relevance Feedback as a recommendation to the user. Finally, we validate the performance of our proposed work against previous works on the following metrics: Execution Time, Response Time, Load Balancing Rate, and Scalability. © 2019 Published by Elsevier B.V.
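The abstract gives no implementation details, so the following is only a minimal single-machine sketch of the general CAN tree idea it names: transactions are inserted into a prefix tree with their items in one fixed canonical order, and a query pattern is answered by traversing that tree. All names here (`CanNode`, `build_can_tree`, `support`, `answer_query`) are hypothetical. Lexicographic order is assumed as the canonical order. The `suggestions` fallback is a simple stand-in, not the paper's Relevance Feedback. Hadoop, AP clustering, and EPC optimization are not modeled.

```python
class CanNode:
    """One node of a canonical-order prefix tree (hypothetical helper)."""

    def __init__(self):
        self.count = 0       # transactions whose sorted prefix reaches this node
        self.children = {}   # item -> CanNode


def build_can_tree(transactions):
    """Insert every transaction with its items in a fixed canonical order."""
    root = CanNode()
    for transaction in transactions:
        root.count += 1
        node = root
        # Assumption: canonical order = lexicographic order of the items.
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, CanNode())
            node.count += 1
    return root


def support(node, pattern):
    """Count transactions below `node` that contain all items of `pattern`.

    `pattern` must already be sorted in the same canonical order.
    """
    if not pattern:
        return node.count
    total = 0
    for item, child in node.children.items():
        if item == pattern[0]:
            total += support(child, pattern[1:])
        elif item < pattern[0]:
            total += support(child, pattern)
        # item > pattern[0]: every item below is larger still, so skip.
    return total


def answer_query(root, query, min_support):
    """Return the support of the queried pattern, plus fallback suggestions."""
    pattern = sorted(set(query))
    count = support(root, pattern)
    result = {"pattern": pattern, "support": count}
    if count < min_support:
        # Stand-in fallback: single query items that are frequent on their own.
        result["suggestions"] = [
            item for item in pattern if support(root, [item]) >= min_support
        ]
    return result
```

For example, with the transactions `[["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]`, calling `answer_query(root, ["c", "a"], 2)` reports a support of 2 for the pattern `["a", "c"]`. Because the item order is fixed in advance rather than derived from item frequencies, new transactions can be inserted without reordering the existing tree.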
Pages: 111-122
Number of pages: 12