Association rule mining algorithm based on Spark for pesticide transaction data analyses

Cited by: 6
Authors
Bai, Xiaoning [1, 2]
Jia, Jingdun [1, 3]
Wei, Qiwen [4 ]
Huang, Shuaiqi [1 ]
Du, Weicheng [5 ]
Gao, Wanlin [1 ]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] Minist Agr & Rural Affairs, Inst Control Agrochem, Beijing 100125, Peoples R China
[3] Minist Sci & Technol, Torch Ctr, Beijing 100045, Peoples R China
[4] Natl Agr Technol Promot Ctr, Beijing 100125, Peoples R China
[5] Minist Agr & Rural Affairs, Informat Ctr, Beijing 100125, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spark; association rule mining; ICAMA algorithm; big data; pesticide regulation; MapReduce;
DOI
10.25165/j.ijabe.20191205.4881
Chinese Library Classification number
S2 [Agricultural Engineering];
Discipline classification code
0828 ;
Abstract
With the development of smart agriculture, data in the field of pesticide regulation have accumulated to a considerable scale. The pesticide transaction data collected by the Pesticide National Data Center alone produce more than 10 million records daily. However, because the technical means are outdated, the existing pesticide supervision data have not been deeply mined or used. The Apriori algorithm is one of the classic algorithms in association rule mining, but it needs to traverse the transaction database multiple times, which imposes an extra IO burden. Spark is an emerging big data parallel computing framework with advantages such as in-memory computing and resilient distributed datasets (RDDs). Compared with the Hadoop MapReduce computing framework, it greatly improves IO performance. Therefore, this paper proposed ICAMA, an improved Apriori algorithm based on the Spark framework. A MapReduce process was used to compute the support of the candidate sets and then to generate the next candidate sets. In experimental comparison, when the data volume exceeded 250 Mb, the Spark-based Apriori algorithm performed 20% better than the traditional Hadoop-based Apriori algorithm, and the improvement became more pronounced as the data volume increased.
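The repeated database passes that the abstract attributes to Apriori can be illustrated with a minimal single-machine sketch. This is plain Python and not the paper's ICAMA algorithm or a Spark program; the function names (`count_support`, `generate_candidates`, `apriori`) and the sample product names are hypothetical, chosen only for illustration. Each loop iteration scans every transaction once to count support, which is the step a MapReduce-style job would distribute.

```python
from collections import Counter
from itertools import combinations


def count_support(transactions, candidates):
    """Count how many transactions contain each candidate itemset.

    This mirrors a map step (emit one count per matching candidate)
    followed by a reduce step (sum the counts per candidate).
    """
    counts = Counter()
    for basket in transactions:
        for cand in candidates:
            if cand <= basket:  # candidate is a subset of the transaction
                counts[cand] += 1
    return counts


def generate_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori property)."""
    items = sorted({item for itemset in frequent for item in itemset})
    return {
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    }


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    result, k = {}, 1
    while candidates:
        counts = count_support(baskets, candidates)  # one full pass over the data
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result


if __name__ == "__main__":
    # Hypothetical transactions; product names are placeholders.
    sample = [
        ["herbicide_A", "fungicide_B", "insecticide_C"],
        ["herbicide_A", "fungicide_B"],
        ["herbicide_A", "insecticide_C"],
        ["fungicide_B", "insecticide_C"],
        ["herbicide_A", "fungicide_B", "insecticide_C"],
    ]
    for itemset, n in sorted(apriori(sample, 3).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), n)
```

Because every value of k triggers another full scan in `count_support`, a disk-based framework rereads the data on each pass, whereas an in-memory framework can keep the transactions cached between passes.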
Pages: 162-166
Number of pages: 5