Efficient multivariate data-oriented microaggregation

Authors
Josep Domingo-Ferrer
Antoni Martínez-Ballesté
Josep Maria Mateo-Sanz
Francesc Sebé
Affiliations
[1] Rovira i Virgili University of Tarragona,Department of Computer Engineering & Maths
[2] Rovira i Virgili University of Tarragona,Statistics Group
Source
The VLDB Journal | 2006 / Vol. 15
Keywords
Statistical databases; Privacy; Anonymity; Statistical disclosure control; Microaggregation; Microdata protection;
DOI
Not available
Abstract
Microaggregation is a family of methods for statistical disclosure control (SDC) of microdata (records on individuals and/or companies), that is, for masking microdata so that they can be released while preserving the privacy of the underlying individuals. The principle of microaggregation is to aggregate original database records into small groups prior to publication. Each group should contain at least k records to prevent disclosure of individual information, where k is a constant value preset by the data protector. Recently, microaggregation has been shown to be useful for achieving k-anonymity, in addition to being a good masking method. Optimal microaggregation (with minimum within-groups variability loss) can be computed in polynomial time for univariate data. Unfortunately, for multivariate data it is an NP-hard problem. Several heuristic approaches to microaggregation have been proposed in the literature. Heuristics yielding groups with fixed size k tend to be more efficient, whereas data-oriented heuristics yielding variable group size tend to result in lower information loss. This paper presents new data-oriented heuristics which improve on the trade-off between computational complexity and information loss and are thus usable for large datasets.
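The grouping principle described in the abstract can be illustrated with a minimal sketch. This is not one of the paper's heuristics: it is a simple greedy fixed-size scheme chosen only for illustration, and the names `microaggregate`, `centroid`, and `sse` are hypothetical. It assumes numeric records given as equal-length tuples, forms groups of at least k records, replaces each record by its group centroid, and measures information loss as the within-group sum of squared errors.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))


def microaggregate(records, k):
    """Illustrative greedy fixed-size heuristic (an assumption, not the paper's method).

    Returns (masked, groups): masked[i] is the centroid of the group that
    contains record i; groups is a list of lists of record indices.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(range(len(records)))
    groups = []
    # Peel off groups of exactly k records while at least 2k records remain.
    while len(remaining) >= 2 * k:
        c = centroid([records[i] for i in remaining])
        # Seed: the remaining record farthest from the remaining centroid.
        seed = max(remaining, key=lambda i: dist(records[i], c))
        # Group: the k remaining records closest to the seed (seed included).
        group = sorted(remaining, key=lambda i: dist(records[i], records[seed]))[:k]
        groups.append(group)
        chosen = set(group)
        remaining = [i for i in remaining if i not in chosen]
    # The last group holds between k and 2k - 1 records.
    groups.append(remaining)
    masked = [None] * len(records)
    for group in groups:
        c = centroid([records[i] for i in group])
        for i in group:
            masked[i] = c
    return masked, groups


def sse(records, masked):
    """Within-group sum of squared errors, used here as the information loss."""
    return sum(dist(r, m) ** 2 for r, m in zip(records, masked))


# Hypothetical toy data: two well-separated clusters in two dimensions.
records = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
masked, groups = microaggregate(records, k=3)
```

In this sketch every released record is shared by at least k originals, which is the property the abstract links to k-anonymity. A data-oriented heuristic would instead let group sizes vary between k and 2k - 1 to follow the natural clustering of the data and reduce the loss.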
Pages: 355-369
Page count: 14