Association rule hiding

Cited by: 240
Authors
Verykios, VS
Elmagarmid, AK
Bertino, E
Saygin, Y
Dasseni, E
Affiliations
[1] Comp Technol Inst, Data & Knowledge Engn Grp, Patras 26221, Greece
[2] Hewlett Packard Corp, Off Strategy & Technol, Palo Alto, CA USA
[3] Univ Milan, Dept Comp Sci, I-20135 Milan, Italy
[4] Sabanci Univ, Fac Engn & Nat Sci, TR-34956 Istanbul, Turkey
[5] TXT E Solut SpA, I-20126 Milan, Italy
Funding
US National Science Foundation;
Keywords
privacy preserving data mining; association rule mining; sensitive rule hiding;
DOI
10.1109/TKDE.2004.1269668
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large repositories of data contain sensitive information that must be protected against unauthorized access. Protecting the confidentiality of this information has been a long-term goal for the database security research community and for government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may encounter when releasing data to outside parties. A key problem, still insufficiently investigated, is how to balance the confidentiality of the disclosed data with the legitimate needs of the data users. Every disclosure limitation method affects and, in some way, modifies true data values and relationships. In this paper, we investigate confidentiality issues of a broad category of rules, the association rules. In particular, we present three strategies and five algorithms for hiding a group of association rules that is characterized as sensitive. A rule is characterized as sensitive if its disclosure risk is above a certain privacy threshold. Sometimes, sensitive rules should not be disclosed to the public since, among other things, they may be used for inferring sensitive data or they may provide business competitors with an advantage. We also perform an evaluation study of the hiding algorithms in order to analyze their time complexity and the impact that they have on the original database.
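To make the idea of hiding a sensitive rule concrete, the following is a minimal sketch and not one of the five algorithms from the paper. It assumes that a rule's disclosure risk is measured by its confidence, and that a rule counts as hidden once its confidence falls below a chosen threshold. The names `count`, `confidence`, `hide_rule`, and `min_conf` are hypothetical, chosen for illustration only.

```python
from typing import List, Set


def count(db: List[Set[str]], itemset: Set[str]) -> int:
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in db if itemset <= t)


def confidence(db: List[Set[str]], lhs: Set[str], rhs: Set[str]) -> float:
    """Confidence of the rule lhs -> rhs (0.0 if lhs never occurs)."""
    lhs_count = count(db, lhs)
    return count(db, lhs | rhs) / lhs_count if lhs_count else 0.0


def hide_rule(db: List[Set[str]], lhs: Set[str], rhs: Set[str],
              min_conf: float) -> List[Set[str]]:
    """Return a modified copy of db in which conf(lhs -> rhs) < min_conf.

    Assumption: lhs and rhs are disjoint and min_conf > 0.
    Greedy illustration: delete one right-hand-side item from supporting
    transactions, one transaction at a time, until the rule is hidden.
    """
    sanitized = [set(t) for t in db]   # leave the original database intact
    victim = sorted(rhs)[0]            # deterministic choice of item to delete
    for t in sanitized:
        if confidence(sanitized, lhs, rhs) < min_conf:
            break
        if (lhs | rhs) <= t:
            t.discard(victim)
    return sanitized


if __name__ == "__main__":
    db = [{"bread", "milk"}, {"bread", "milk"}, {"bread", "milk", "eggs"},
          {"bread"}, {"milk"}]
    released = hide_rule(db, {"bread"}, {"milk"}, min_conf=0.6)
    print(confidence(db, {"bread"}, {"milk"}))
    print(confidence(released, {"bread"}, {"milk"}))
```

Each deleted item is a change to true data values, which illustrates the trade-off the abstract describes: the fewer transactions a method has to modify, the smaller its impact on the released database.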
Pages: 434-447
Page count: 14