Statistical limitations of sensitive itemset hiding methods

Authors
Jangra, Shalini [1, 2, 3]
Toshniwal, Durga [1]
Clifton, Chris [2, 3]
Affiliations
[1] IIT Roorkee, Dept Comp Sci & Engn, Roorkee 247667, Uttarakhand, India
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[3] Purdue Univ, CERIAS, W Lafayette, IN 47907 USA
Keywords
Privacy preserving data mining; Itemset suppression; Heuristic approaches; Outlier detection
DOI
10.1007/s10489-023-04781-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Frequent itemset hiding has long been studied in privacy-preserving data mining. The goal is to alter a dataset so that it can be released without revealing particular sensitive aggregates (e.g., frequent itemsets or association rules). The typical approach removes items from transactions to reduce the support of the sensitive itemset(s) below a threshold while minimizing the impact on other frequent itemsets. In this paper, we ask whether such hiding can be discovered: do hiding methods leave anomalies suggesting that a sensitive itemset likely existed in the dataset and has been hidden? We show that a suppressed sensitive itemset may behave like an outlier among its neighboring itemsets, indicating that the dataset has likely been altered. KL-divergence and χ²-divergence are used to measure the difference between the expected and actual probability distributions of itemsets and thereby expose anomalous behavior. Experimental results on four datasets show that suppressed sensitive itemsets often stand out as the most significant outlier, irrespective of the victim item selection method. We propose two defensive approaches that counter this attack.
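
The divergence comparison mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's estimator: the expected probability here is assumed to come from a simple item-independence model, and the function names (support, anomaly_scores) and the toy transactions are hypothetical, chosen only for illustration.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def chi2_divergence(p, q):
    """Pearson chi-square divergence: sum of (p - q)^2 / q over outcomes."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

def support(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)

def anomaly_scores(transactions, itemset, eps=1e-9):
    """Compare observed support with an independence-based expectation.

    Assumption (not from the paper): the expected probability of the
    itemset is the product of the supports of its individual items.
    """
    observed = support(transactions, itemset)
    expected = 1.0
    for item in itemset:
        expected *= support(transactions, [item])
    # Keep the expected probability strictly inside (0, 1) to avoid
    # division by zero in either divergence.
    expected = min(max(expected, eps), 1.0 - eps)
    p = [observed, 1.0 - observed]   # actual: itemset present / absent
    q = [expected, 1.0 - expected]   # expected: itemset present / absent
    return kl_divergence(p, q), chi2_divergence(p, q)

# Hypothetical toy data: four transactions over items "a" and "b".
toy = [["a", "b"], ["a"], ["b"], ["a", "b"]]
kl, chi2 = anomaly_scores(toy, ["a", "b"])
```

Under these assumptions, an itemset whose scores are much larger than those of its neighboring itemsets would be a candidate outlier; how neighbors are chosen and ranked is left to the paper.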
Pages: 24275-24292
Page count: 18