Map-optimize-reduce: CAN tree assisted FP-growth algorithm for clusters based FP mining on Hadoop

Cited by: 21
Authors
Ragaventhiran, J. [1]
Kavithadevi, M. K. [2]
Affiliations
[1] Syed Ammal Engn Coll, Dept CSE, Ramanathapuram, India
[2] Thiagarajar Coll Engn, Dept CSE, Madurai, Tamil Nadu, India
Keywords
Frequent pattern mining; Map-optimize-reduce; Clustering; Load balancing; CAN tree based FP growth; User query; FREQUENT PATTERNS; SEQUENTIAL PATTERNS; SKEWED DATA; MAPREDUCE;
DOI
10.1016/j.future.2019.09.041
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In recent years, Frequent Pattern Mining (FPM) has emerged as a significant approach for discovering interesting knowledge concealed in data. However, previous works did not address the validation of FPM with user queries, and achieving good scalability and execution time remains a bottleneck owing to the difficulty of handling large datasets. To address these shortcomings, our proposed work performs FPM using an extended version of the MapReduce framework in a Hadoop environment. It comprises five processes: 1) Preprocessing, 2) Affinity Propagation (AP) based Clustering, 3) Load Balancing, 4) Map-Optimize-Reduce, and 5) Mining User Queries. First, preprocessing removes data redundancy. To speed up the MapReduce framework, we propose AP clustering, which generates effective clusters from the given dataset. Load balancing is then performed among the different blocks, based on a computed reputation. To avoid oversights in scanning and to minimize the search space in MapReduce, an optimizer based on Emperor Penguin Colony (EPC) optimization is placed between the Mapper and the Reducer. Frequent patterns are mined using CANonical order (CAN) tree based Frequent Pattern (FP) growth, which reduces execution time and frequent tree construction. The user submits a Mining_Request to Hadoop, and the frequent patterns mined for the given query are sent back to the user. If the query is not present in the CAN tree, Relevance Feedback is sent to the user as a recommendation. Finally, we validate the performance of our proposed work against previous works on the following metrics: Execution Time, Response Time, Load Balancing Rate, and Scalability. (C) 2019 Published by Elsevier B.V.
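As a rough illustration of the CAN tree idea mentioned in the abstract, the sketch below inserts each transaction into a prefix tree in one fixed canonical item order and then answers a support query for an itemset. It is a minimal sketch, not the authors' implementation: the names `CanNode`, `build_can_tree`, and `support` are hypothetical, lexicographic order is assumed as the canonical order, and the clustering, load balancing, EPC optimizer, and Relevance Feedback steps are not modeled.

```python
class CanNode:
    """One node of a canonical-order prefix tree (hypothetical helper)."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0        # number of transactions passing through this node
        self.children = {}    # item -> CanNode


def build_can_tree(transactions):
    """Insert every transaction with its items sorted in canonical order.

    Assumption: the canonical order is plain lexicographic order.
    """
    root = CanNode()
    for tx in transactions:
        node = root
        for item in sorted(set(tx)):
            node = node.children.setdefault(item, CanNode(item))
            node.count += 1
    return root


def support(root, query):
    """Count the transactions that contain every item of `query`."""
    items = sorted(set(query))
    if not items:
        raise ValueError("query must contain at least one item")

    def walk(node, idx):
        total = 0
        for item, child in node.children.items():
            if item > items[idx]:
                # Paths are sorted, so items[idx] cannot occur below this child.
                continue
            if item == items[idx]:
                if idx + 1 == len(items):
                    total += child.count      # whole query matched on this path
                else:
                    total += walk(child, idx + 1)
            else:
                total += walk(child, idx)     # keep looking for items[idx]
        return total

    return walk(root, 0)


transactions = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
tree = build_can_tree(transactions)
print(support(tree, ["a", "c"]))
print(support(tree, ["d"]))
```

A query whose items never occur in the tree gets a support of zero. In the workflow described in the abstract, that is the case where a recommendation would be returned instead of mined patterns.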
Pages: 111-122
Page count: 12