Distributed synthesized association mining for big transactional data

Cited by: 4
Authors
Pal, Amrit [1, 2]
Kumar, Manish [2]
Affiliations
[1] GLA Univ, Dept Comp Engn & Applicat, Mathura, India
[2] Indian Inst Informat Technol Allahabad, Dept Informat Technol, Prayagraj, India
Keywords
Big Data; HDFS; MapReduce; Apriori; frequent itemset; association rule; DATA SETS; RULES; PATTERNS;
DOI
10.1007/s12046-020-01380-8
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Data volumes, and transactional databases in particular, are growing rapidly. Partitioning this data and storing it in a distributed manner is an effective approach to storage and retrieval. Mining such distributed data with minimal dependence between sub-problems is a crucial task. Finding frequent itemsets and the corresponding association rules is a major challenge when the results must be aggregated in a distributed environment. To overcome these challenges, we propose a distributed frequent itemset generation and association rule mining algorithm using the MapReduce programming model. The proposed scheme generates frequent itemsets and mines association rules using a synthesized distributed technique. The rules are mined in a distributed manner, and weights are then assigned to the subsets of data and to the association rules. The rules generated in a distributed manner are then combined using a weighted approach. This paper presents a novel MapReduce-based synthesis approach that works well over distributed storage of large amounts of data.
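The map-then-synthesize idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not state how weights are computed, so the sketch assumes each partition's weight is its share of all transactions and that a synthesized rule's support and confidence are the weighted sums of its local values. The function names `local_rules` and `synthesize` are hypothetical, and only rules between single items are mined to keep the example short.

```python
from collections import defaultdict
from itertools import combinations


def local_rules(transactions, min_support, min_confidence):
    """Map step (illustrative): mine rules x -> y between single items
    in one partition. Returns {(x, y): (support, confidence)}."""
    n = len(transactions)
    if n == 0:
        return {}
    counts = defaultdict(int)
    for t in transactions:
        items = sorted(set(t))
        for item in items:
            counts[(item,)] += 1
        for pair in combinations(items, 2):
            counts[pair] += 1
    rules = {}
    for itemset, c in counts.items():
        # Keep only frequent 2-itemsets (an assumption made for brevity).
        if len(itemset) != 2 or c / n < min_support:
            continue
        a, b = itemset
        for x, y in ((a, b), (b, a)):
            confidence = c / counts[(x,)]
            if confidence >= min_confidence:
                rules[(x, y)] = (c / n, confidence)
    return rules


def synthesize(partition_rules, partition_sizes, min_support, min_confidence):
    """Reduce step (illustrative): combine local rules with weights.
    Assumed weighting: partition size divided by total size."""
    total = sum(partition_sizes)
    weights = [size / total for size in partition_sizes]
    acc = defaultdict(lambda: [0.0, 0.0])
    for w, rules in zip(weights, partition_rules):
        for rule, (support, confidence) in rules.items():
            acc[rule][0] += w * support
            acc[rule][1] += w * confidence
    return {
        rule: (s, c)
        for rule, (s, c) in acc.items()
        if s >= min_support and c >= min_confidence
    }


# Two hypothetical partitions of a transactional data set.
p1 = [["a", "b"], ["a", "b"], ["a", "c"], ["b"]]
p2 = [["a", "b"], ["c"]]
r1 = local_rules(p1, min_support=0.5, min_confidence=0.6)
r2 = local_rules(p2, min_support=0.5, min_confidence=0.6)
merged = synthesize([r1, r2], [len(p1), len(p2)], 0.4, 0.6)
```

Under this assumed weighting, a rule that is absent from a partition's local result contributes zero for that partition, so synthesized values can underestimate the true global support. Each partition is mined independently, which is what keeps dependence between sub-problems low.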
Pages: 13
Related papers
50 in total
  • [1] Distributed synthesized association mining for big transactional data
    Amrit Pal
    Manish Kumar
    Sādhanā, 2020, 45
  • [2] Hadoop based Mining of Distributed Association Rules from Big Data
    Bouraoui, Marwa
    Bouzouita, Ines
    Touzi, Amel Grissa
    2017 18TH INTERNATIONAL CONFERENCE ON SCIENCES AND TECHNIQUES OF AUTOMATIC CONTROL AND COMPUTER ENGINEERING (STA), 2017, : 185 - 190
  • [3] Distributed Big Advertiser Data Mining
    Bindra, Ashish
    Pokuri, Sreenivasulu
    Uppala, Krishna
    Teredesai, Ankur
    12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012), 2012, : 914 - 914
  • [4] Parallel Mining Frequent Patterns over Big Transactional Data in Extended MapReduce
    Chen, Hui
    Lin, Tsau Young
    Zhang, Zhibing
    Zhong, Jie
    2013 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC), 2013, : 43 - 48
  • [5] Mining association rules in big data with NGEP
    Yunliang Chen
    Fangyuan Li
    Junqing Fan
    Cluster Computing, 2015, 18 : 577 - 585
  • [6] Mining association rules in big data with NGEP
    Chen, Yunliang
    Li, Fangyuan
    Fan, Junqing
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (02): : 577 - 585
  • [7] Distributed Big Data Mining Platform for Smart Grid
    Wang, Zhixiang
    Wu, Bin
    Bai, Demeng
    Qin, Jiafeng
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 2345 - 2354
  • [8] ClowdFlows: Online workflows for distributed big data mining
    Kranjc, Janez
    Orac, Roman
    Podpecan, Vid
    Lavrac, Nada
    Robnik-Sikonja, Marko
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2017, 68 : 38 - 58
  • [9] Big Data Mining Using Public Distributed Computing
    Jurgelevicius, Albertas
    Sakalauskas, Leonidas
    INFORMATION TECHNOLOGY AND CONTROL, 2018, 47 (02): : 236 - 248
  • [10] Distributed Relationship Mining over Big Scholar Data
    Zhang, Da
    Kabuka, Mansur R.
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2021, 9 (01) : 354 - 365