MapReduce-based Frequent Itemset Mining for Analysis of Electronic Evidence

Cited by: 0
Authors
Jiang, Xueqing [1]
Sun, Guozi [1, 2]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Comp, Nanjing, Jiangsu, Peoples R China
[2] Jiangsu High Technol Res Key Lab Wireless Sensor, Nanjing, Jiangsu, Peoples R China
Keywords
computer crime; PISPO; ISPO-tree; MapReduce; frequent itemset; data mining; association rules; TREE;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Association rule mining can extract evidence relevant to computer crime from massive data sets, uncover association rules among itemsets, and further reveal crime trends and connections among different crimes. These rules give police clues and criteria that help them solve cases and prevent crime. Frequent itemset mining (FIM) plays a fundamental role in mining associations and correlations, and in many real-world data mining applications such as the analysis of electronic evidence. FP-growth is among the best-known FIM algorithms for discovering frequent patterns. As the data grows incrementally, however, the cost in time and space becomes the bottleneck of FP-growth mining algorithms. An existing incremental frequent pattern mining algorithm, SPO-tree, can perform incremental mining with a single scan of the data. How to apply this algorithm more effectively to the analysis of electronic evidence is the focus of this paper. Previous research has paid little attention to the case in which already-mined frequent items must be updated, or in which only a small amount of data is inserted. Existing algorithms are poorly suited to this problem, especially in the forensic domain. In this paper, we therefore propose a novel parallelized algorithm called PISPO, based on the cloud-computing framework MapReduce, which is widely used to cope with large-scale data. PISPO captures both the content and the state of the changed and original transaction datasets and distributes them to the SPO-tree.
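To make the general idea of MapReduce-style frequent itemset counting concrete, the following is a minimal single-process sketch in Python. It is not the PISPO algorithm and does not build an SPO-tree or FP-tree; it only illustrates how support counting can be split into a map step (emit candidate itemsets per transaction), a shuffle step (group by itemset), and a reduce step (sum counts and apply a support threshold). The function names, the `min_support` parameter, and the toy transactions are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, k):
    """Map step: emit (itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))  # sort so each itemset has one canonical key
    for itemset in combinations(items, k):
        yield itemset, 1


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a MapReduce runtime would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values, min_support):
    """Reduce step: sum the counts and keep the itemset only if it is frequent."""
    total = sum(values)
    return (key, total) if total >= min_support else None


def frequent_itemsets(transactions, k, min_support):
    """Run map, shuffle, and reduce in-process and return {itemset: support}."""
    pairs = (pair for t in transactions for pair in map_phase(t, k))
    result = {}
    for key, values in shuffle(pairs).items():
        kept = reduce_phase(key, values, min_support)
        if kept is not None:
            result[kept[0]] = kept[1]
    return result


# Hypothetical toy "evidence" transactions (event types observed per session).
transactions = [
    ["login", "usb", "delete"],
    ["login", "usb"],
    ["login", "delete"],
    ["usb", "delete", "email"],
]

singles = frequent_itemsets(transactions, k=1, min_support=3)
pairs = frequent_itemsets(transactions, k=2, min_support=2)
print(singles)
print(pairs)
```

In a real MapReduce deployment the map and reduce functions would run on separate workers over partitions of the transaction data; here they run sequentially so that the data flow is easy to follow.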
Pages: 6
Related papers
50 entries in total
  • [21] On A Visual Frequent Itemset Mining
    Lim, SeungJin
    2009 FOURTH INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION MANAGEMENT, 2009, : 25 - 30
  • [22] Sequence-Growth: A Scalable and Effective Frequent Itemset Mining Algorithm for Big Data Based on MapReduce Framework
    Liang, Yen-hui
    Wu, Shiow-yang
    2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015, 2015, : 393 - 400
  • [23] Fast Mining Algorithm of Frequent Itemset Based on Spark
    Ding J.-M.
    Li H.-B.
    Deng B.
    Jia L.-Y.
    You J.-G.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05): : 2446 - 2464
  • [24] A MapReduce-based approach to social network big data mining
    Qi, Fuli
    JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2023, 23 (05) : 2535 - 2547
  • [25] A MapReduce-Based User Identification Algorithm in Web Usage Mining
    Srivastava, Mitali
    Garg, Rakhi
    Mishra, P. K.
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND WEB ENGINEERING, 2018, 13 (02) : 11 - 23
  • [26] Frequent Itemset Mining Algorithm based on Sampling Method
    Li, Haifeng
    Zhang, Ning
    Zhang, Yuejin
    PROCEEDINGS OF THE 2015 5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCES AND AUTOMATION ENGINEERING, 2016, 42 : 852 - 855
  • [27] An FPGA-Based Accelerator for Frequent Itemset Mining
    Zhang, Yan
    Zhang, Fan
    Jin, Zheming
    Bakos, Jason D.
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2013, 6 (01)
  • [28] Frequent Itemset Mining Algorithm Based on Linear Table
    Lu, Jun
    Xu, Wenhe
    Zhou, Kailong
    Guo, Zhicong
    JOURNAL OF DATABASE MANAGEMENT, 2023, 34 (01)
  • [29] A Distributed Frequent Itemset Mining Algorithm Based on Spark
    Gui, Feng
    Ma, Yunlong
    Zhang, Feng
    Liu, Min
    Li, Fei
    Shen, Weiming
    Bai, Hua
    PROCEEDINGS OF THE 2015 IEEE 19TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN (CSCWD), 2015, : 271 - 275
  • [30] Model-based probabilistic frequent itemset mining
    Bernecker, Thomas
    Cheng, Reynold
    Cheung, David W.
    Kriegel, Hans-Peter
    Lee, Sau Dan
    Renz, Matthias
    Verhein, Florian
    Wang, Liang
    Zuefle, Andreas
    KNOWLEDGE AND INFORMATION SYSTEMS, 2013, 37 (01) : 181 - 217