Generalization-based privacy preservation and discrimination prevention in data publishing and mining

Cited by: 35
Authors
Hajian, Sara [1]
Domingo-Ferrer, Josep [1]
Farras, Oriol [1]
Affiliations
[1] Univ Rovira & Virgili, Dept Comp Engn & Maths, UNESCO Chair Data Privacy, E-43007 Tarragona, Spain
Keywords
Data mining; Anti-discrimination; Privacy; Generalization; k-anonymity
DOI
10.1007/s10618-014-0346-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Living in the information society facilitates the automatic collection of huge amounts of data on individuals, organizations, etc. Publishing such data for secondary analysis (e.g. learning models and finding patterns) may be extremely useful to policy makers, planners, marketing analysts, researchers and others. Yet, data publishing and mining do not come without dangers, namely privacy invasion and also potential discrimination against the individuals whose data are published. Discrimination may ensue from training data mining models (e.g. classifiers) on data which are biased against certain protected groups (ethnicity, gender, political preferences, etc.). The objective of this paper is to describe how to obtain data sets for publication that are: (i) privacy-preserving; (ii) unbiased regarding discrimination; and (iii) as useful as possible for learning models and finding patterns. We present the first generalization-based approach to simultaneously offer privacy preservation and discrimination prevention. We formally define the problem, give an optimal algorithm to tackle it and evaluate the algorithm in terms of both general and specific data analysis metrics (i.e. various types of classifiers and rule induction algorithms). It turns out that the impact of our transformation on the quality of data is the same or only slightly higher than the impact of achieving just privacy preservation. In addition, we show how to extend our approach to different privacy models and anti-discrimination legal concepts.
Pages: 1158 - 1188
Number of pages: 31
Related papers
50 in total
  • [21] A Comparative Survey on Privacy Preservation And Privacy measuring Techniques in Data Publishing
    Kumar, Atul
    Gyanchandani, Manasi
    PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2018, : 1902 - 1906
  • [22] A Comparative Review of Privacy Preservation Techniques in Data Publishing
    Kumar, Atul
    Gyanchandani, Manasi
    Jain, Priyank
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON INVENTIVE SYSTEMS AND CONTROL (ICISC 2018), 2018, : 1027 - 1032
  • [23] Sensitive Label Privacy Preservation with Anatomization for Data Publishing
    Yao, Lin
    Chen, Zhenyu
    Wang, Xin
    Liu, Dong
    Wu, Guowei
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2021, 18 (02) : 904 - 917
  • [24] Privacy Preservation for Trajectory Data Publishing and Heuristic Approach
    Harnsamut, Nattapon
    Natwichai, Juggapong
    ADVANCES IN NETWORK-BASED INFORMATION SYSTEMS, NBIS-2017, 2018, 7 : 787 - 797
  • [25] DATA MINING AS A TOOL IN PRIVACY-PRESERVING DATA PUBLISHING
    Sramka, Michal
    NILCRYPT 10, 2010, 45 : 151 - 159
  • [26] A Survey on Privacy Issues and Privacy Preservation in Spatial Data Mining
    Kamakshi, P.
    2014 IEEE INTERNATIONAL CONFERENCE ON CIRCUIT, POWER AND COMPUTING TECHNOLOGIES (ICCPCT-2014), 2014, : 1759 - 1762
  • [27] A Survey of Privacy Preserving Data Publishing using Generalization and Suppression
    Xu, Yang
    Ma, Tinghuai
    Tang, Meili
    Tian, Wei
    APPLIED MATHEMATICS & INFORMATION SCIENCES, 2014, 8 (03) : 1103 - 1116
  • [28] An Analysis of Privacy Preservation Techniques in Data Mining
    Sachan, Abhishek
    Roy, Devshri
    Arun, P. V.
    ADVANCES IN COMPUTING AND INFORMATION TECHNOLOGY, VOL 3, 2013, 178 : 119 - +
  • [29] A Method for Preservation of Privacy in Data Mining Processes
    Liang, Danyan
    Busch, Peter
    Picoto, Winnie
    VISION 2020: SUSTAINABLE ECONOMIC DEVELOPMENT AND APPLICATION OF INNOVATION MANAGEMENT, 2018, : 203 - 223
  • [30] Preservation of Privacy in Data Mining by using PCA Based Perturbation Technique
    Gokulnath, C.
    Priyan, M. K.
    Balan, E. Vishnu
    Prabha, K. P. Rama
    Jeyanthi, R.
    2015 INTERNATIONAL CONFERENCE ON SMART TECHNOLOGIES AND MANAGEMENT FOR COMPUTING, COMMUNICATION, CONTROLS, ENERGY AND MATERIALS (ICSTM), 2015, : 202 - 206