Hiding sensitive knowledge without side effects

Cited by: 38
Authors
Gkoulalas-Divanis, Aris [1]
Verykios, Vassilios S. [1]
Affiliations
[1] Univ Thessaly, Dept Comp & Commun Engn, Volos 38221, Greece
Keywords
Data mining; Association rule hiding; Borders of frequent itemsets; Parallelization
DOI
10.1007/s10115-008-0178-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sensitive knowledge hiding in large transactional databases is one of the major goals of privacy preserving data mining. However, it is only recently that researchers were able to identify exact solutions for the hiding of knowledge, depicted in the form of sensitive frequent itemsets and their related association rules. Exact solutions allow for the hiding of vulnerable knowledge without any critical compromises, such as the hiding of nonsensitive patterns or the accidental uncovering of infrequent itemsets, amongst the frequent ones, in the sanitized outcome. In this paper, we highlight the process of border revision, which plays a significant role towards the identification of exact hiding solutions, and we provide efficient algorithms for the computation of the revised borders. Furthermore, we review two algorithms that identify exact hiding solutions, and we extend the functionality of one of them to effectively identify exact solutions for a wider range of problems (than its original counterpart). Following that, we introduce a novel framework for decomposition and parallel solving of hiding problems, which are handled by each of these approaches. This framework improves to a substantial degree the size of the problems that both algorithms can handle and significantly decreases their runtime. Through experimentation, we demonstrate the effectiveness of these approaches toward providing high quality knowledge hiding solutions.
Pages: 263-299
Number of pages: 37