Distributed synthesized association mining for big transactional data

Cited by: 4
Authors
Pal, Amrit [1, 2]
Kumar, Manish [2]
Affiliations
[1] GLA Univ, Dept Comp Engn & Applicat, Mathura, India
[2] Indian Inst Informat Technol Allahabad, Dept Informat Technol, Prayagraj, India
Keywords
Big Data; HDFS; MapReduce; Apriori; frequent itemset; association rule; DATA SETS; RULES; PATTERNS;
DOI
10.1007/s12046-020-01380-8
Chinese Library Classification
T [Industrial Technology];
Discipline classification code
08;
Abstract
Data volumes, including transactional databases, are growing rapidly. Dividing this data and storing it in a distributed manner is an effective way to support storage and retrieval. Mining such distributed data with minimal dependence between sub-problems is a crucial task. Finding frequent itemsets and the corresponding association rules is a major challenge when results must be aggregated across a distributed environment. To overcome these challenges, we propose a distributed frequent itemset generation and association rule mining algorithm using the MapReduce programming model. The proposed scheme generates frequent itemsets and mines association rules using a synthesized distributed technique. The rules are mined in a distributed manner, and then weights are assigned to the subsets of data and to the association rules. The rules generated in this distributed manner are then combined using a weighted approach. This paper presents a novel MapReduce-based synthesis approach that works well over distributed storage of large amounts of data.
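The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes: rules are mined independently on each data subset, and the local results are then combined with weights. Everything here is an assumption for illustration, not the authors' method: the function names (`map_partition`, `local_rules`, `synthesize`), the restriction to single-item rules, and the choice of weighting each subset by its share of all transactions. A rule that a subset does not report contributes zero from that subset, which is a deliberate simplification.

```python
from collections import Counter
from itertools import combinations


def map_partition(transactions, max_size=2):
    """Map-style step: count itemsets of up to max_size items in one subset."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def local_rules(transactions, min_support, min_confidence):
    """Mine single-item rules x -> y inside one subset (hypothetical helper)."""
    n = len(transactions)
    counts = map_partition(transactions)
    rules = {}
    for itemset, c in counts.items():
        if len(itemset) != 2 or c / n < min_support:
            continue
        a, b = itemset
        for x, y in ((a, b), (b, a)):
            conf = c / counts[(x,)]
            if conf >= min_confidence:
                rules[(x, y)] = (c / n, conf)
    return rules


def synthesize(partitions, min_support, min_confidence):
    """Reduce-style step: weighted sum of local support and confidence.

    Assumption: a subset's weight is its share of all transactions.
    """
    total = sum(len(p) for p in partitions)
    merged = {}
    for p in partitions:
        w = len(p) / total
        found = local_rules(p, min_support, min_confidence)
        for rule, (sup, conf) in found.items():
            s, c = merged.get(rule, (0.0, 0.0))
            merged[rule] = (s + w * sup, c + w * conf)
    return merged


if __name__ == "__main__":
    subsets = [
        [["a", "b"], ["a", "b"], ["a"], ["c"]],
        [["a", "b"], ["b", "c"]],
    ]
    for rule, (sup, conf) in sorted(synthesize(subsets, 0.5, 0.6).items()):
        print(rule, round(sup, 3), round(conf, 3))
```

In a real MapReduce job the per-subset mining would run in mappers and the weighted merge in a reducer; here both run in one process so the sketch stays self-contained.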
Pages: 13
Related papers
50 in total
  • [41] A Distributed Approach of Big Data Mining for Financial Fraud Detection in a Supply Chain
    Zhou, Hangjun
    Sun, Guang
    Fu, Sha
    Fan, Xiaoping
    Jiang, Wangdong
    Hu, Shuting
    Li, Lingjiao
    CMC-COMPUTERS MATERIALS & CONTINUA, 2020, 64 (02): : 1091 - 1105
  • [42] Big Data Mining: Managing the Costs of Data Mining
    Ganasan, Jaya R.
    2019 17TH INTERNATIONAL CONFERENCE ON ICT AND KNOWLEDGE ENGINEERING (ICT&KE), 2019, : 62 - 65
  • [43] Big Data Analytics in Association Rule Mining: A Systematic Literature Review
    Shahin, Mahtab
    Peious, Sijo Arakkal
    Sharma, Rahul
    Kaushik, Minakshi
    Ben Yahia, Sadok
    Shah, Syed Attique
    Draheim, Dirk
    2021 THE 3RD INTERNATIONAL CONFERENCE ON BIG DATA ENGINEERING AND TECHNOLOGY, BDET 2021, 2021, : 40 - 49
  • [44] Mining of Web Server Logs in a Distributed Cluster Using Big Data Technologies
    Savitha, K.
    Vijaya, M. S.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2014, 5 (01) : 137 - 142
  • [45] Mining 'Following' Patterns from Big but Sparsely Distributed Social Network Data
    Leung, Carson K.
    Middleton, Ryan
    Pazdor, Adam G. M.
    Won, Yeyoung
    2018 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), 2018, : 916 - 919
  • [46] A distributed frequent itemset mining algorithm using Spark for Big Data analytics
    Feng Zhang
    Min Liu
    Feng Gui
    Weiming Shen
    Abdallah Shami
    Yunlong Ma
    Cluster Computing, 2015, 18 : 1493 - 1501
  • [47] A distributed platform for intrusion detection system using data stream mining in a big data environment
    Schuartz, Fabio Cesar
    Fonseca, Mauro
    Munaretto, Anelise
    ANNALS OF TELECOMMUNICATIONS, 2024, 79 (7-8) : 507 - 521
  • [48] Estimation of Transactional Network Data Between Branch Offices using Transactional Big Data Throughout Japan
    Ogawa, Yoshiki
    Akiyama, Yuki
    Yoshihide, Sekimoto
    Shibasaki, Ryosuke
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 3916 - 3924
  • [49] DISTRIBUTED DATA MINING
    Fiolet, Valerie
    Toursel, Bernard
    SCALABLE COMPUTING-PRACTICE AND EXPERIENCE, 2005, 6 (01): : 99 - 109
  • [50] Data field for mining big data
    Wang, Shuliang
    Li, Ying
    Wang, Dakui
    GEO-SPATIAL INFORMATION SCIENCE, 2016, 19 (02) : 106 - 118