Swapping-based Data Sanitization Method for Hiding Sensitive Frequent Itemset in Transaction Database

Citations: 0
Authors
Gunawan, Dedi [1 ]
Nugroho, Yusuf Sulistyo [1 ]
Maryam [1 ]
Affiliations
[1] Univ Muhammadiyah Surakarta, Informat Engn Dept, Surakarta, Indonesia
Keywords
Transaction database; data sanitization; data mining; sensitive frequent itemset; swapping-based method; FAST ALGORITHMS; PRIVACY;
DOI
10.14569/IJACSA.2021.0121179
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Sharing a transaction database with other parties to explore valuable information is increasingly common among business institutions such as retailers and supermarkets. It offers the institutions various benefits, such as revealing customer shopping behavior and frequently bought items, known as frequent itemsets. Because such information is important, some institutions may regard certain frequent itemsets as sensitive information that should be kept private. Therefore, before sharing a database, institutions should apply privacy-preserving data mining (PPDM) techniques to prevent sensitive information breaches. Several PPDM methods have been developed, including item suppression-based and item insertion-based methods. Unfortunately, these methods change the database significantly and induce side effects such as hiding failure, significant data dissimilarity, misses cost, and artificial frequent itemsets. In this paper, we propose a swapping-based data sanitization method that hides the sensitive frequent itemsets while minimizing the side effects of the sanitization process. Experimental results indicate that the proposed method outperforms existing methods in minimizing these side effects.
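
The side effects named in the abstract can be made concrete with a toy example. The Python sketch below is illustrative only and does not reproduce the paper's algorithm: the function names, the 0.5 support threshold, the four-transaction database, and the hand-made swap of items `b` and `d` between two transactions are all assumptions. It takes hiding failure to mean sensitive itemsets that are still frequent after sanitization, misses cost to mean non-sensitive frequent itemsets that are lost, and artificial itemsets to mean itemsets that become frequent only after sanitization.

```python
from itertools import combinations


def support(itemset, db):
    """Fraction of transactions that contain every item of itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in db) / len(db)


def frequent_itemsets(db, min_sup, max_size=3):
    """Brute-force enumeration of frequent itemsets (toy data only)."""
    items = sorted(set().union(*db))
    found = set()
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            if support(combo, db) >= min_sup:
                found.add(frozenset(combo))
    return found


def side_effects(original, sanitized, sensitive, min_sup):
    """Compare frequent itemsets before and after sanitization."""
    before = frequent_itemsets(original, min_sup)
    after = frequent_itemsets(sanitized, min_sup)
    sensitive = {frozenset(s) for s in sensitive}
    return {
        # sensitive itemsets that are still frequent
        "hiding_failure": sensitive & after,
        # non-sensitive frequent itemsets that were lost
        "misses_cost": (before - sensitive) - after,
        # itemsets that are frequent only after sanitization
        "artificial": after - before,
    }


# Hypothetical data: the sensitive itemset {a, b} has support 2/4.
original = [frozenset(t) for t in ({"a", "b"}, {"a", "b", "c"},
                                   {"a", "c"}, {"c", "d"})]
# Hand-made swap (an assumption): b from transaction 1 is exchanged
# with d from transaction 4, so every item keeps its overall count.
sanitized = [frozenset(t) for t in ({"a", "d"}, {"a", "b", "c"},
                                    {"a", "c"}, {"c", "b"})]

result = side_effects(original, sanitized, [{"a", "b"}], min_sup=0.5)
for name, itemsets in result.items():
    print(name, sorted(sorted(s) for s in itemsets))
```

In this toy case the swap hides {a, b} and loses no non-sensitive frequent itemset, but {b, c} becomes frequent, which shows why artificial itemsets are counted as a side effect even when individual item counts are preserved.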
Pages: 693-701
Page count: 9
Related papers
50 in total
  • [31] The GA-based algorithms for optimizing hiding sensitive itemsets through transaction deletion
    Chun-Wei Lin
    Tzung-Pei Hong
    Kuo-Tung Yang
    Shyue-Liang Wang
    Applied Intelligence, 2015, 42 : 210 - 230
  • [32] An Efficient Spark-Based Hybrid Frequent Itemset Mining Algorithm for Big Data
    Al-Bana, Mohamed Reda
    Farhan, Marwa Salah
    Othman, Nermin Abdelhakim
    DATA, 2022, 7 (01)
  • [33] A sliding window based algorithm for frequent closed itemset mining over data streams
    Nori, Fatemeh
    Deypir, Mahmood
    Sadreddini, Mohamad Hadi
    JOURNAL OF SYSTEMS AND SOFTWARE, 2013, 86 (03) : 615 - 623
  • [34] ERASE- EntRopy-based SAnitization of SEnsitive Data for Privacy Preservation
    Medsger, Jeffrey
    Srinivasan, Avinash
    2012 INTERNATIONAL CONFERENCE FOR INTERNET TECHNOLOGY AND SECURED TRANSACTIONS, 2012, : 427 - 432
  • [35] Minimizing information loss in shared data: Hiding frequent patterns with multiple sensitive support thresholds
    Bostanoglu, Belgin Ergenc
    Ozturk, Ahmet Cumhur
    STATISTICAL ANALYSIS AND DATA MINING, 2020, 13 (04) : 309 - 323
  • [36] An Efficient Sliding Window Based Algorithm for Adaptive Frequent Itemset Mining over Data Streams
    Deypir, Mhmood
    Sadreddini, Mohammad Hadi
    Taahomi, Mehran
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2013, 29 (05) : 1001 - 1020
  • [37] HFIM: a Spark-based hybrid frequent itemset mining algorithm for big data processing
    Sethi, Krishan Kumar
    Ramesh, Dharavath
    JOURNAL OF SUPERCOMPUTING, 2017, 73 (08) : 3652 - 3668
  • [38] An Efficient Outlier Detection Approach Over Uncertain Data Stream Based on Frequent Itemset Mining
    Hao, Shangbo
    Cai, Saihua
    Sun, Ruizhi
    Li, Sicong
    INFORMATION TECHNOLOGY AND CONTROL, 2019, 48 (01) : 34 - 46
  • [39] A New Sliding Window Based Algorithm for Frequent Closed Itemset Mining Over Data Streams
    Nori, Fatemeh
    Deypir, Mahmood
    Sadreddini, Mohamad Hadi
    Ziarati, Korosh
    2011 1ST INTERNATIONAL ECONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2011, : 249 - 253
  • [40] Feature selection based on closed frequent itemset mining: A case study on SAGE data classification
    Seeja, K. R.
    NEUROCOMPUTING, 2015, 151 : 1027 - 1032