Distributed synthesized association mining for big transactional data

Cited by: 4
Authors
Pal, Amrit [1, 2]
Kumar, Manish [2 ]
Affiliations
[1] GLA Univ, Dept Comp Engn & Applicat, Mathura, India
[2] Indian Inst Informat Technol Allahabad, Dept Informat Technol, Prayagraj, India
Keywords
Big Data; HDFS; MapReduce; Apriori; frequent itemset; association rule; DATA SETS; RULES; PATTERNS;
DOI
10.1007/s12046-020-01380-8
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08;
Abstract
The volume of data, including transactional databases, is growing rapidly. Partitioning this data and storing it in a distributed manner is an effective approach to storage and retrieval. Mining such distributed data with minimal dependence between sub-problems is a crucial task. Finding frequent itemsets and the corresponding association rules is challenging when results must be aggregated across a distributed environment. To overcome these challenges, we propose a distributed frequent itemset generation and association rule mining algorithm using the MapReduce programming model. The proposed scheme generates frequent itemsets and mines association rules using a synthesized distributed technique. The rules are mined in a distributed manner, and then weights are assigned to subsets of data and to association rules. The rules generated in a distributed manner are then combined using a weighted approach. This paper presents a novel MapReduce-based synthesis approach, which works well over distributed storage of large amounts of data.
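The abstract does not spell out how the weights are chosen, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each partition is weighted by its share of the total transactions, that local supports are combined as a weighted sum, and that itemsets are enumerated by brute force rather than with Apriori pruning. The function names (`local_supports`, `synthesize_supports`, `derive_rules`) and the sample data are hypothetical, and the map and reduce steps are simulated in a single process rather than on Hadoop.

```python
from collections import defaultdict
from itertools import combinations


def local_supports(transactions, max_size=2):
    """Map step (simulated): relative support of each itemset in one partition."""
    counts = defaultdict(int)
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    n = len(transactions)
    return {itemset: c / n for itemset, c in counts.items()}


def synthesize_supports(partitions, min_support, max_size=2):
    """Reduce step (simulated): weighted sum of local supports.

    Assumption: a partition's weight is its share of all transactions.
    """
    total = sum(len(p) for p in partitions)
    combined = defaultdict(float)
    for partition in partitions:
        weight = len(partition) / total
        for itemset, support in local_supports(partition, max_size).items():
            combined[itemset] += weight * support
    return {i: s for i, s in combined.items() if s >= min_support}


def derive_rules(frequent, min_confidence):
    """Build rules (antecedent, consequent, support, confidence) from frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(itemset, k):
                consequent = tuple(i for i in itemset if i not in antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so the antecedent is always present in `frequent`.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, support, confidence))
    return rules


if __name__ == "__main__":
    # Two hypothetical partitions of a transactional database.
    partition_1 = [["bread", "milk"], ["bread", "milk", "eggs"], ["bread"], ["milk"]]
    partition_2 = [["bread", "milk"], ["eggs"], ["bread", "eggs"], ["bread", "milk"]]
    frequent = synthesize_supports([partition_1, partition_2], min_support=0.5)
    for rule in derive_rules(frequent, min_confidence=0.7):
        print(rule)
```

With size-proportional weights, the weighted sum equals the global support of each itemset, so this sketch reproduces what a single-machine pass would find. A scheme that weights partitions or individual rules differently, as the abstract suggests, would change only the `weight` computation.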
Pages: 13
Related papers
50 in total
  • [31] EAFIM: efficient apriori-based frequent itemset mining algorithm on Spark for big transactional data
    Shashi Raj
    Dharavath Ramesh
    M. Sreenu
    Krishan Kumar Sethi
    Knowledge and Information Systems, 2020, 62 : 3565 - 3583
  • [32] Data Mining Technique for Reduction of Association Rules in Distributed System
    Waghamare, Bhagyashri
    Bodhe, Yogesh
    2016 INTERNATIONAL CONFERENCE ON AUTOMATIC CONTROL AND DYNAMIC OPTIMIZATION TECHNIQUES (ICACDOT), 2016, : 415 - 418
  • [33] Mining association rules in non-transactional databases
    Lee, Ho-Jong
    Lim, Seung-Hwan
    Oh, Hyun-Kyo
    Cho, Jinsoo
    Kim, Sang-Wook
    Cha, Jaehyuk
    Lee, Junghoon
    Kim, Hanil
    INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL, 2012, 15 (11B): : 5055 - 5069
  • [34] Dataless Data Mining: Association Rules-based Distributed Privacy-preserving Data Mining
    Ashok, Vikas G.
    Navuluri, K.
    Alhafdhi, A.
    Mukkamala, R.
    2015 12TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY - NEW GENERATIONS, 2015, : 615 - 620
  • [35] Mining "Big Data" using Big Data Services
    Reips, Ulf-Dietrich
    Matzat, Uwe
    INTERNATIONAL JOURNAL OF INTERNET SCIENCE, 2014, 9 (01) : 1 - 8
  • [36] Mining association rules on Big Data through MapReduce genetic programming
    Padillo, F.
    Luna, J. M.
    Herrera, F.
    Ventura, S.
    INTEGRATED COMPUTER-AIDED ENGINEERING, 2018, 25 (01) : 31 - 48
  • [37] An evolutionary algorithm for mining rare association rules: a Big Data approach
    Padillo, F.
    Luna, J. M.
    Ventura, S.
    2017 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2017, : 2007 - 2014
  • [38] Online education big data mining method based on association rules
    Zhang N.
    International Journal of Information and Communication Technology, 2024, 24 (03) : 262 - 272
  • [39] A distributed frequent itemset mining algorithm using Spark for Big Data analytics
    Zhang, Feng
    Liu, Min
    Gui, Feng
    Shen, Weiming
    Shami, Abdallah
    Ma, Yunlong
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (04): : 1493 - 1501
  • [40] Dynamic Distributed and Parallel Machine Learning algorithms for big data mining processing
    Djafri, Laouni
    DATA TECHNOLOGIES AND APPLICATIONS, 2022, 56 (04) : 558 - 601