Association rule mining algorithm based on Spark for pesticide transaction data analyses

Cited by: 6
Authors
Bai, Xiaoning [1, 2]
Jia, Jingdun [1, 3]
Wei, Qiwen [4 ]
Huang, Shuaiqi [1 ]
Du, Weicheng [5 ]
Gao, Wanlin [1 ]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] Minist Agr & Rural Affairs, Inst Control Agrochem, Beijing 100125, Peoples R China
[3] Minist Sci & Technol, Torch Ctr, Beijing 100045, Peoples R China
[4] Natl Agr Technol Promot Ctr, Beijing 100125, Peoples R China
[5] Minist Agr & Rural Affairs, Informat Ctr, Beijing 100125, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spark; association rule mining; ICAMA algorithm; big data; pesticide regulation; MapReduce;
DOI
10.25165/j.ijabe.20191205.4881
Chinese Library Classification number
S2 [Agricultural Engineering];
Discipline classification code
0828;
Abstract
With the development of smart agriculture, data in the field of pesticide regulation has accumulated to a considerable scale. The pesticide transaction data collected by the Pesticide National Data Center alone amounts to more than 10 million records daily. However, because of outdated technical means, the existing pesticide supervision data has not been deeply mined or used. The Apriori algorithm is one of the classic algorithms in association rule mining, but it needs to traverse the transaction database multiple times, which causes an extra IO burden. Spark is an emerging big data parallel computing framework with advantages such as in-memory computing and resilient distributed datasets (RDDs). Compared with the Hadoop MapReduce computing framework, its IO performance is greatly improved. Therefore, this paper proposes ICAMA, an improved Apriori algorithm based on the Spark framework. A MapReduce process is used to compute the support of the candidate sets and then to generate new candidate sets. In experimental comparison, when the data volume exceeds 250 Mb, the performance of the Spark-based Apriori algorithm is 20% higher than that of the traditional Hadoop-based Apriori algorithm, and the improvement becomes more pronounced as the data volume increases.
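The abstract does not give the details of ICAMA, so the following is only a minimal sketch of the general idea it describes: plain Apriori in which support counting is written as a map step over data partitions followed by a reduce step, and the surviving itemsets are joined into the next round of candidates. It uses pure Python rather than Spark, and the function names, the toy transactions, and the absolute support threshold are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def count_support(partitions, candidates):
    """Map each partition to local candidate counts, then reduce by summing."""
    def map_partition(transactions):
        local = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:  # candidate itemset is contained in the transaction
                    local[c] += 1
        return local

    return reduce(lambda a, b: a + b, map(map_partition, partitions), Counter())


def generate_candidates(frequent, k):
    """Build k-itemsets whose every (k-1)-subset is frequent (Apriori pruning)."""
    items = sorted({i for s in frequent for i in s})
    return {
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    }


def apriori(partitions, min_support):
    """Return {itemset: count} for all itemsets with count >= min_support."""
    candidates = {frozenset([i]) for p in partitions for t in p for i in t}
    frequent, k = {}, 1
    while candidates:
        counts = count_support(partitions, candidates)
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        candidates = generate_candidates(set(level), k)
    return frequent


# Hypothetical toy data: two partitions of transactions over items A, B, C.
partitions = [
    [frozenset({"A", "B", "C"}), frozenset({"A", "B"})],
    [frozenset({"A", "C"}), frozenset({"B", "C"}), frozenset({"A", "B", "C"})],
]
result = apriori(partitions, min_support=3)
```

In this sketch the database is still scanned once per itemset size, which is the repeated-traversal cost the abstract attributes to Apriori. In a Spark setting the partitions would typically be held in memory between rounds, which is the source of the IO advantage the abstract claims over Hadoop MapReduce.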
Pages: 162-166
Number of pages: 5