Statistical limitations of sensitive itemset hiding methods

Cited by: 0
Authors
Shalini, Jangra [1 ,2 ,3 ]
Durga, Toshniwal [1 ]
Chris, Clifton [2 ,3 ]
Affiliations
[1] IIT Roorkee, Dept Comp Sci & Engn, Roorkee 247667, Uttarakhand, India
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[3] Purdue Univ, CERIAS, W Lafayette, IN 47907 USA
Keywords
Privacy preserving data mining; Itemset suppression; Heuristic approaches; Outlier detection;
DOI
10.1007/s10489-023-04781-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Frequent Itemset Hiding has long been an area of study for privacy-preserving data mining. The goal is to alter a dataset so that it may be released without revealing particular sensitive aggregates (e.g., frequent itemsets or association rules). Typically, the approach is to remove items from transactions to reduce the support of the sensitive itemset(s) below a threshold, while minimizing the changes or impact on other frequent itemsets. In this paper, we ask whether such hiding can be discovered: do hiding methods lead to anomalies that suggest that a sensitive itemset likely existed in the dataset and has been hidden? We show that a suppressed sensitive itemset may behave like an outlier among its neighboring itemsets after suppression, indicating that the dataset has likely been altered. KL-divergence and χ²-divergence are used to calculate the difference between the expected and actual probability distributions of itemsets in order to observe anomalous behavior. Experimental results on four datasets show that suppressed sensitive itemsets often stand out as the most significant outlier, irrespective of the victim item selection method. We propose two defensive approaches that counter this attack.
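The divergence comparison mentioned in the abstract can be illustrated with a minimal sketch. The record does not say how the paper derives the expected distribution, so this example assumes, purely for illustration, that an itemset's expected support is the product of its single-item supports (item independence). All function names below are hypothetical and are not taken from the paper.

```python
import math

def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def expected_support(transactions, itemset):
    """Expected support under item independence (an illustrative assumption)."""
    p = 1.0
    for item in itemset:
        p *= support(transactions, [item])
    return p

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) between Bernoulli distributions with success probs p and q."""
    total = 0.0
    for a, b in ((p, q), (1 - p, 1 - q)):
        if a > 0:  # terms with zero probability contribute nothing
            total += a * math.log(a / max(b, eps))
    return total

def chi2_divergence(p, q, eps=1e-12):
    """Chi-square divergence: sum over outcomes of (p_i - q_i)^2 / q_i."""
    return sum((a - b) ** 2 / max(b, eps) for a, b in ((p, q), (1 - p, 1 - q)))

def anomaly_scores(transactions, itemsets):
    """Score each itemset by how far its observed support is from the expected one."""
    scores = {}
    for itemset in itemsets:
        observed = support(transactions, itemset)
        expected = expected_support(transactions, itemset)
        scores[tuple(itemset)] = (
            kl_divergence(observed, expected),
            chi2_divergence(observed, expected),
        )
    return scores
```

Under these assumptions, an analyst would score a candidate itemset together with its neighboring itemsets and treat the one with the largest divergence as a possible outlier; the paper's actual neighborhood definition and expected-distribution model may differ.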
Pages: 24275 - 24292
Page count: 18
Related papers
50 in total
  • [21] Multi-level high utility-itemset hiding
    Nguyen, Loan T. T.
    Duong, Hoa
    Mai, An
    Vo, Bay
    PLOS ONE, 2025, 20 (02):
  • [22] Publishing Sensitive Transactions for Itemset Utility
    Xu, Yabo
    Fung, Benjamin C. M.
    Wang, Ke
    Fu, Ada W. C.
    Pei, Jian
    ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2008, : 1109 - +
  • [23] STATISTICAL-METHODS IN BIOLOGICAL ANTHROPOLOGY - CONCEPTS, DEVELOPMENTS AND LIMITATIONS
    CREWS, DE
    WAY, AB
    AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY, 1990, 81 (02) : 185 - 185
  • [24] THE LIMITATIONS OF MULTIVARIATE STATISTICAL-METHODS IN THE MENSURATION OF HUMAN MISERY
    HALL, W
    AUSTRALIAN AND NEW ZEALAND JOURNAL OF PSYCHIATRY, 1989, 23 (04): : 503 - 511
  • [25] BRIEF SURVEY OF SCOPE AND LIMITATIONS OF QUANTUM AND STATISTICAL MECHANICAL METHODS
    GELLER, M
    SHUGAR, D
    DRUGS UNDER EXPERIMENTAL AND CLINICAL RESEARCH, 1986, 12 (6-7) : 595 - 612
  • [26] An end-to-end knowledge graph solution to the frequent itemset hiding problem
    Krasadakis, Panteleimon
    Futia, Giuseppe
    Verykios, Vassilios S.
    Sakkopoulos, Evangelos
    INFORMATION SCIENCES, 2024, 672
  • [27] Statistical limitations in functional neuroimaging I. Non-inferential methods and statistical models
    Petersson, KM
    Nichols, TE
    Poline, JB
    Holmes, AP
    PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 1999, 354 (1387) : 1239 - 1260
  • [28] LIMITATIONS OF STATISTICAL METHODS FOR PREDICTING PETROLEUM AND NATURAL GAS RESERVES AND AVAILABILITY
    RYAN, JM
    JOURNAL OF PETROLEUM TECHNOLOGY, 1965, 17 (09): : 1067 - &
  • [29] Key Concepts and Limitations of Statistical Methods for Evaluating Biomarkers of Kidney Disease
    Parikh, Chirag R.
    Thiessen-Philbrook, Heather
    JOURNAL OF THE AMERICAN SOCIETY OF NEPHROLOGY, 2014, 25 (08): : 1621 - 1629
  • [30] Geostatistical and multivariate statistical methods for the assessment of polluted soils - merits and limitations
    Einax, JW
    Soldt, U
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 1999, 46 (01) : 79 - 91