The Effect of Clustering on Data Privacy

Cited by: 6
Authors
Canbay, Pelin [1 ]
Sever, Hayri [1 ]
Affiliations
[1] Hacettepe Univ, Dept Comp Engn, Ankara, Turkey
Keywords
data privacy; privacy preserving; anonymization; clustering; data diversity
DOI
10.1109/ICMLA.2015.198
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The data collected by various organizations provide opportunities for developing future solutions. To improve quality of life, accurate data must be shareable with research communities and scientists. However, accurate records of personal data include sensitive information about individuals, so sharing these records without applying any anonymization criteria paves the way for breaches of personal privacy. In an effort to protect personal privacy, Privacy-Preserving Data Mining (PPDM) and Privacy-Preserving Data Publishing (PPDP) approaches have been studied extensively. Numerous works have been dedicated to diversifying techniques for de-identification or anonymization of identifiable datasets, but there is an important trade-off between data loss and data privacy: when the original data are anonymized, they are exposed to information loss. In order to minimize information loss, anonymization algorithms give up preserving diversity. In this study, we propose an approach that uses a clustering algorithm as a preprocessing step for privacy-preserving methods to improve the diversity of anonymized data. In addition, the effect of clustering on anonymization is evaluated using the original and clustered forms of a real-world dataset. The results are evaluated in terms of usability in scientific work, and it was observed that a clustering algorithm and an effective anonymization algorithm must be used together in privacy preservation approaches in order to keep the diversity of the original datasets.
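The pipeline described in the abstract (cluster first, then anonymize each cluster) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not name the clustering or anonymization method, so the sketch assumes a plain 1-D k-means over a single numeric quasi-identifier and a simple range-generalization step in the style of k-anonymity. The function names, the `age` and `diagnosis` fields, and the toy records are all hypothetical.

```python
from statistics import mean


def kmeans_1d(values, n_clusters, iterations=20):
    """Plain 1-D k-means with evenly spaced initial centroids (an assumption)."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / n_clusters for i in range(n_clusters)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest centroid.
        labels = [min(range(n_clusters), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Move each centroid to the mean of its members.
        for c in range(n_clusters):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels


def generalize(records, k):
    """Replace each age with a range shared by at least k records."""
    if len(records) < k:
        raise ValueError("fewer than k records; the cluster would need merging")
    ordered = sorted(records, key=lambda r: r["age"])
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # Fold an undersized trailing group into the previous one.
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())
    out = []
    for group in groups:
        age_range = f"{group[0]['age']}-{group[-1]['age']}"
        out.extend({"age": age_range, "diagnosis": r["diagnosis"]} for r in group)
    return out


def cluster_then_anonymize(records, n_clusters, k):
    """Cluster on the quasi-identifier, then anonymize each cluster separately."""
    labels = kmeans_1d([r["age"] for r in records], n_clusters)
    result = []
    for c in range(n_clusters):
        members = [r for r, lab in zip(records, labels) if lab == c]
        if members:
            result.extend(generalize(members, k))
    return result


# Toy records invented for illustration only.
records = [
    {"age": 21, "diagnosis": "flu"}, {"age": 23, "diagnosis": "asthma"},
    {"age": 25, "diagnosis": "flu"}, {"age": 27, "diagnosis": "diabetes"},
    {"age": 61, "diagnosis": "asthma"}, {"age": 63, "diagnosis": "flu"},
    {"age": 65, "diagnosis": "diabetes"}, {"age": 67, "diagnosis": "asthma"},
]
anonymized = cluster_then_anonymize(records, n_clusters=2, k=2)
```

Because generalization happens inside each cluster, the age ranges never span the two age groups in the toy data. That is one plausible reading of how a clustering pre-process could help keep the structure of the original data, though the paper's own diversity measure is not given in the abstract.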
Pages: 277-282
Number of pages: 6