An enhanced privacy-preserving record linkage approach for multiple databases

Cited by: 3
Authors
Han, Shumin [1]
Shen, Derong [1]
Nie, Tiezheng [1]
Kou, Yue [1]
Yu, Ge [1]
Affiliation
[1] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110169, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Record linkage; Privacy; Bloom filter; Multi-LUs; Blocking
DOI
10.1007/s10586-022-03590-7
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
For research purposes, organizations often need to share and link records that belong to the same individual while protecting privacy, a task referred to as privacy-preserving record linkage (PPRL). Various approaches have been developed to tackle this problem; however, it remains challenging because of massive data volumes, multiple data sources, and 'dirty' data. Therefore, in this paper, an enhanced approximate multi-party PPRL (MP-PPRL) approach is proposed to improve privacy, scalability, and linkage quality. For privacy, the Bloom filter (BF) is so far a better and more efficient masking technique than the alternatives, so records are encoded into BFs. However, BFs may be compromised through frequency-based attacks. To enhance privacy, a distributed protocol that introduces multiple linkage units (Multi-LUs) to resist frequency-based attacks is proposed. For scalability, we develop a blocking technique called BF-SNN, based on the sorted nearest neighborhood (SNN) approach, which clusters similar BFs across multiple databases and dramatically reduces complexity. For linkage quality, a personalized threshold that varies with the level of 'dirty' data is introduced, which provides more accurate error tolerance and consequently improves linkage quality. An analysis and an empirical study are conducted on large real-world datasets to show the benefit of the proposed approach.
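The Bloom filter encoding mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bigram tokenization, the double-hashing scheme, the filter length `size`, the hash count `num_hashes`, and the Dice similarity are all common choices in the PPRL literature that are assumed here for illustration. Unkeyed hashes are used for brevity, whereas a real deployment would typically use keyed hashes with a secret shared among the database owners.

```python
import hashlib


def bigrams(value):
    """Return the set of overlapping 2-character q-grams of a padded string."""
    padded = "_" + value.lower() + "_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode(value, size=1000, num_hashes=10):
    """Encode a string as a Bloom filter, returned as the set of 1-bit positions.

    Assumption: positions are derived by double hashing (h1 + i * h2) mod size.
    """
    positions = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        h1 = int(hashlib.sha1(data).hexdigest(), 16)
        h2 = int(hashlib.sha256(data).hexdigest(), 16)
        for i in range(num_hashes):
            positions.add((h1 + i * h2) % size)
    return positions


def dice(a, b):
    """Dice coefficient of two Bloom filters given as sets of 1-bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # A typo variant should score higher than an unrelated name.
    print(dice(encode("smith"), encode("smyth")))
    print(dice(encode("smith"), encode("jones")))
```

Because similar strings share most of their bigrams, their filters share most of their 1-bits, which is what allows approximate matching on masked values. A fixed similarity threshold would be applied to the Dice score in a basic setup; the personalized threshold described in the abstract is not reproduced here.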
Pages: 3641-3652
Number of pages: 12
Related papers
50 in total
  • [1] An enhanced privacy-preserving record linkage approach for multiple databases
    Shumin Han
    Derong Shen
    Tiezheng Nie
    Yue Kou
    Ge Yu
    Cluster Computing, 2022, 25 : 3641 - 3652
  • [2] Accurate privacy-preserving record linkage for databases with missing values
    Vaiwsri, Sirintra
    Ranbaduge, Thilina
    Christen, Peter
    Schnell, Rainer
    INFORMATION SYSTEMS, 2022, 106
  • [3] Privacy-preserving record linkage
    Verykios, Vassilios S.
    Christen, Peter
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2013, 3 (05) : 321 - 332
  • [4] Privacy-Preserving Record Linkage
    Hall, Rob
    Fienberg, Stephen E.
    PRIVACY IN STATISTICAL DATABASES, 2010, 6344 : 269 - +
  • [5] Privacy-Preserving Record Linkage with Spark
    Valkering, Onno
    Belloum, Adam
    2019 19TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2019, : 440 - 448
  • [6] Privacy-Preserving Temporal Record Linkage
    Ranbaduge, Thilina
    Christen, Peter
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 377 - 386
  • [7] Privacy-Preserving Record Linkage via Bilinear Pairing Approach
    Lin, Chih-Hsun
    Yu, Chia-Mu
    2018 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS-TAIWAN (ICCE-TW), 2018,
  • [8] Privacy-preserving record linkage in large databases using secure multiparty computation
    Laud, Peeter
    Pankova, Alisa
    BMC MEDICAL GENOMICS, 2018, 11
  • [9] Privacy-preserving record linkage in large databases using secure multiparty computation
    Peeter Laud
    Alisa Pankova
    BMC Medical Genomics, 11
  • [10] Privacy-Preserving Fraud Detection Across Multiple Phone Record Databases
    Henecka, Wilko
    Roughan, Matthew
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2015, 12 (06) : 640 - 651