Anonymization of moving objects databases by clustering and perturbation

Cited by: 129
Authors
Abul, Osman [2]
Bonchi, Francesco [1]
Nanni, Mirco [3]
Affiliations
[1] Yahoo Res, Barcelona 08018, Spain
[2] TOBB Univ Econ & Technol, Dept Comp Engn, Ankara, Turkey
[3] CNR, ISTI, Area Ric Pisa, Pisa KDD Lab, I-56124 Pisa, Italy
Keywords
Moving objects databases; Trajectories; Anonymity; Uncertainty; Clustering; Microaggregation
DOI
10.1016/j.is.2010.05.003
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Preserving individual privacy when publishing data is a problem that is receiving increasing attention. Thanks to its simplicity, the concept of k-anonymity, introduced by Samarati and Sweeney [1], has established itself as a fundamental principle for privacy-preserving data publishing. According to the k-anonymity principle, each release of data must be such that each individual is indistinguishable from at least k-1 other individuals. In this article we tackle the problem of anonymizing moving objects databases. We propose a novel concept of k-anonymity based on co-localization that exploits the inherent uncertainty of a moving object's whereabouts. Due to sampling and the imprecision of positioning systems (e.g., GPS), the trajectory of a moving object is no longer a polyline in a three-dimensional space; instead, it is a cylindrical volume whose radius delta represents the possible location imprecision: we know that the trajectory of the moving object lies within this cylinder, but we do not know exactly where. If another object moves within the same cylinder, the two objects are indistinguishable from each other. This leads to the definition of (k,delta)-anonymity for moving objects databases. We first characterize the (k,delta)-anonymity problem, then we recall NWA (Never Walk Alone), a method that we introduced in [2] based on clustering and spatial perturbation. Starting from a discussion of the limits of NWA, we develop a novel clustering method that, being based on EDR distance [3], has the important feature of being time-tolerant. As a consequence, it perturbs trajectories both in space and time. The novel method, named W4M (Wait for Me), is empirically shown to produce higher-quality anonymization than NWA, at the price of higher computational requirements. Therefore, in order to make W4M scalable to large datasets, we introduce two variants based on a novel (and computationally cheaper) time-tolerant distance function and on chunking.
All the variants of W4M are empirically evaluated in terms of data quality and efficiency, and thoroughly compared to their predecessor NWA. Data quality is assessed both by means of objective measures of information distortion and by more usability-oriented measures, i.e., by comparing the results of (i) spatio-temporal range queries and (ii) frequent pattern mining, executed on the original database and on the (k,delta)-anonymized one. Experimental results over both real-world and synthetic mobility data confirm that, for a wide range of values of delta and k, the relative distortion introduced by our anonymization methods is kept low. Moreover, the techniques introduced to make W4M scalable to large datasets achieve their goal without giving up data quality in the anonymization process. © 2010 Elsevier B.V. All rights reserved.
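The co-localization idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that all trajectories are 2-D and sampled at the same timestamps, and it reads "moving within the same cylinder" as "never more than delta apart at any shared timestamp", which may differ from the paper's formal definition (for example in how delta relates to the cylinder radius). The function names `co_localized` and `is_anonymity_group` are hypothetical.

```python
from itertools import combinations
from math import hypot

# A trajectory is a list of (x, y) positions. This sketch ASSUMES every
# trajectory is sampled at the same timestamps, so index i in one
# trajectory is the same instant as index i in another.


def co_localized(t1, t2, delta):
    """Return True if t1 and t2 are at most delta apart at every timestamp."""
    if len(t1) != len(t2):
        raise ValueError("trajectories must share the same timestamps")
    return all(
        hypot(x1 - x2, y1 - y2) <= delta
        for (x1, y1), (x2, y2) in zip(t1, t2)
    )


def is_anonymity_group(trajectories, k, delta):
    """Return True if there are at least k trajectories, all pairwise co-localized.

    Assumption of this sketch: a group hides each member among the others
    when every pair stays within delta of each other for the whole time span.
    """
    if len(trajectories) < k:
        return False
    return all(
        co_localized(a, b, delta) for a, b in combinations(trajectories, 2)
    )


# Three toy trajectories moving along parallel horizontal lines.
a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
b = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5)]
c = [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)]
```

With these toy inputs, `a` and `b` stay 0.5 apart and so form a group for k = 2 and delta = 1.0, while `c` is 3.0 away from `a` and breaks any group that includes it. Checking all pairs is quadratic in the group size, which is acceptable for a small illustration but says nothing about how NWA or W4M build their clusters.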
Pages: 884-910
Number of pages: 27