MICF: An effective sanitization algorithm for hiding sensitive patterns on data mining

Cited by: 14
Authors
Li, Yu-Chiang
Yeh, Jieh-Shan
Chang, Chin-Chen
Affiliations
[1] Natl Chung Cheng Univ, Dept Comp Sci & Informat Engn, Chiayi 62102, Taiwan
[2] Providence Univ, Dept Comp Sci & Informat Management, Taichung 433, Taiwan
[3] Feng Chia Univ, Dept Informat Engn & Comp Sci, Taichung 40724, Taiwan
Keywords
data mining; association rule; privacy-preserving; sensitive rule hiding
DOI
10.1016/j.aei.2006.12.003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data mining mechanisms have been widely applied in businesses and manufacturing companies across many industry sectors. Sharing data or mined rules has become a trend among business partners, as it is perceived to be a mutually beneficial way of increasing productivity for all parties involved. Nevertheless, this has also increased the risk of unexpected information leaks when releasing data. To conceal restrictive itemsets (patterns) contained in the source database, a sanitization process transforms the source database into a released database from which the counterpart cannot extract sensitive rules. The transformation also conceals some non-restrictive information, an unwanted side effect known as the "misses cost". The problem of finding an optimal sanitization method, which conceals all restrictive itemsets while minimizing the misses cost, is NP-hard. To address this challenging problem, this study proposes the maximum item conflict first (MICF) algorithm. Experimental results demonstrate that the proposed method is effective, has a low sanitization rate, and can generally achieve a significantly lower misses cost than the MinFIA, MaxFIA, IGA and Algo2b methods on several real and artificial datasets. © 2007 Elsevier Ltd. All rights reserved.
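The sanitization process and the misses cost described in the abstract can be illustrated with a small sketch. This is not the MICF algorithm from the paper, whose steps are not given in this record; it is a simple greedy heuristic written for illustration. The function names, the toy database, the support threshold and the item-selection rule (delete the item that occurs in the most restrictive itemsets supported by the chosen transaction) are all assumptions. The misses cost is computed here as the fraction of non-restrictive frequent itemsets that stop being frequent after sanitization, where itemsets containing a restrictive itemset are excluded because they must be hidden as well.

```python
from itertools import combinations

def support(db, itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)

def frequent_itemsets(db, min_support):
    """Brute-force enumeration of frequent itemsets (toy data only)."""
    items = sorted(set().union(*db))
    found = set()
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            if support(db, frozenset(combo)) >= min_support:
                found.add(frozenset(combo))
    return found

def sanitize(db, restrictive, min_support):
    """Illustrative greedy sanitizer (an assumption, not the paper's MICF).

    While a restrictive itemset is still frequent, take a transaction that
    supports it and delete the item of that itemset which occurs in the most
    restrictive itemsets supported by the same transaction.
    """
    db = [set(t) for t in db]  # work on a copy
    for r in restrictive:
        while support(db, r) >= min_support:
            victim = next(t for t in db if r <= t)
            live = [s for s in restrictive if s <= victim]
            item = max(sorted(r), key=lambda i: sum(i in s for s in live))
            victim.discard(item)
    return [frozenset(t) for t in db]

def misses_cost(source, released, restrictive, min_support):
    """Share of non-restrictive frequent itemsets lost by sanitization."""
    def non_restrictive(db):
        return {f for f in frequent_itemsets(db, min_support)
                if not any(r <= f for r in restrictive)}
    before = non_restrictive(source)
    after = non_restrictive(released)
    return (len(before) - len(before & after)) / len(before) if before else 0.0

# Hypothetical toy database: each transaction is a set of items.
source = [frozenset(t) for t in ("abc", "abc", "ab", "ac", "bc")]
restrictive = [frozenset("ab")]   # the pattern to hide
min_support = 3                   # absolute support threshold (assumed)

released = sanitize(source, restrictive, min_support)
cost = misses_cost(source, released, restrictive, min_support)
print("support of {a, b} after sanitization:", support(released, restrictive[0]))
print("misses cost:", cost)
```

In this toy run the restrictive itemset {a, b} drops below the threshold after one item deletion, but one legitimate frequent itemset is lost as well, which is the side effect the abstract calls the misses cost. A method such as the one the paper proposes aims to choose deletions so that this loss stays small.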
Pages: 269-280
Page count: 12