Statistical limitations of sensitive itemset hiding methods

Authors
Jangra, Shalini [1, 2, 3]
Toshniwal, Durga [1]
Clifton, Chris [2, 3]
Affiliations
[1] IIT Roorkee, Dept Comp Sci & Engn, Roorkee 247667, Uttarakhand, India
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[3] Purdue Univ, CERIAS, W Lafayette, IN 47907 USA
Keywords
Privacy preserving data mining; Itemset suppression; Heuristic approaches; Outlier detection
DOI
10.1007/s10489-023-04781-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Frequent itemset hiding has long been an area of study in privacy-preserving data mining. The goal is to alter a dataset so that it can be released without revealing particular sensitive aggregates (e.g., frequent itemsets or association rules). The typical approach is to remove items from transactions so that the support of each sensitive itemset falls below a threshold, while minimizing the changes to, or impact on, other frequent itemsets. In this paper, we ask whether such hiding can be discovered: do hiding methods produce anomalies suggesting that a sensitive itemset likely existed in the dataset and has been hidden? We show that a suppressed sensitive itemset may behave like an outlier among its neighboring itemsets, indicating that the dataset has likely been altered. KL-divergence and χ²-divergence are used to measure the difference between the expected and actual probability distributions of itemsets in order to detect anomalous behavior. Experimental results on four datasets show that the suppressed sensitive itemset often stands out as the most significant outlier, irrespective of the victim item selection method. We propose two defensive approaches that counter this attack.
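As a rough illustration of the two divergence measures named in the abstract, the sketch below compares an expected and an observed discrete distribution. It is not the authors' implementation: the function names, the argument order (observed as `p`, expected as `q`), and the toy numbers are all assumptions made for this example, and the record does not say how the paper derives the expected distribution.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(p || q) for discrete distributions.

    p and q are equal-length sequences of probabilities; q must be
    positive wherever p is positive. Terms with p_i == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def chi2_divergence(p, q):
    """Pearson chi-square divergence: sum over i of (p_i - q_i)^2 / q_i.

    Every q_i must be strictly positive.
    """
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


# Hypothetical toy distributions (assumed values, not data from the paper).
observed = [0.5, 0.5]
expected = [0.25, 0.75]

kl_value = kl_divergence(observed, expected)
chi2_value = chi2_divergence(observed, expected)
print(f"KL = {kl_value:.4f}, chi-square = {chi2_value:.4f}")
```

Both measures are zero when the two distributions coincide and grow as they diverge, which is why a large value can flag an itemset whose observed behavior departs from what its neighbors suggest.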
Pages: 24275-24292
Number of pages: 18
Related papers
50 in total
  • [1] Statistical limitations of sensitive itemset hiding methods
    Jangra, Shalini
    Toshniwal, Durga
    Clifton, Chris
    Applied Intelligence, 2023, 53 : 24275 - 24292
  • [2] Hiding Sensitive Itemsets Using Sibling Itemset Constraints
    Yildiz, Baris
    Kut, Alp
    Yilmaz, Reyat
    SYMMETRY-BASEL, 2022, 14 (07):
  • [3] Dynamic Itemset Hiding Algorithm for Multiple Sensitive Support Thresholds
    Ozturk, Ahmet Cumhur
    Ergenc, Belgin
    INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2018, 14 (02) : 37 - 59
  • [4] Closed Itemset based Sensitive Pattern Hiding for Improved Data Utility and Scalability
    Makkar, Himanshu
    Toshniwal, Durga
    Jangra, Shalini
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 4026 - 4035
  • [5] A Frequent Itemset Hiding Toolbox
    Gkoulalas-Divanis, Aris
    Kagklis, Vasileios
    Stavropoulos, Elias C.
    ALGORITHMIC ASPECTS OF CLOUD COMPUTING (ALGOCLOUD 2018), 2019, 11409 : 169 - 182
  • [6] Solving the Sensitive Itemset Hiding Problem Whilst Minimizing Side Effects on a Sanitized Database
    Lee, Guanling
    Chen, Yi-Chun
    Peng, Sheng-Lung
    Lin, Jyun-Hao
    SECURITY-ENRICHED URBAN COMPUTING AND SMART GRID, 2011, 223 : 104 - 113
  • [7] A hybrid approach to frequent itemset hiding
    Gkoulalas-Divanis, Aris
    Verykios, Vassilios S.
    19TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL I, PROCEEDINGS, 2007, : 297 - 304
  • [9] Frequent itemset hiding revisited: pushing hiding constraints into mining
    Verykios, Vassilios S.
    Stavropoulos, Elias C.
    Krasadakis, Panteleimon
    Sakkopoulos, Evangelos
    APPLIED INTELLIGENCE, 2022, 52 (03) : 2539 - 2555
  • [10] Swapping-based Data Sanitization Method for Hiding Sensitive Frequent Itemset in Transaction Database
    Gunawan, Dedi
    Nugroho, Yusuf Sulistyo
    Maryam
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (11) : 693 - 701