CampER: An Effective Framework for Privacy-Aware Deep Entity Resolution

Cited by: 2
Authors
Guo, Yuxiang [1 ]
Chen, Lu [1 ]
Zhou, Zhengjie [2 ]
Zheng, Baihua [3 ]
Fang, Ziquan [1 ]
Zhang, Zhikun [4 ]
Mao, Yuren [2 ]
Gao, Yunjun [1 ]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] Zhejiang Univ, Ningbo, Peoples R China
[3] Singapore Management Univ, Singapore, Singapore
[4] Stanford Univ, Palo Alto, CA 94304 USA
Source
PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023 | 2023
Keywords
entity resolution; representation learning; similarity measurement; linkage
DOI
10.1145/3580305.3599266
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Entity Resolution (ER) is a fundamental problem in data preparation. Standard deep ER methods have achieved state-of-the-art effectiveness, assuming that relations from different organizations are centrally stored. However, due to privacy concerns, it can be difficult to centralize data in practice, rendering standard deep ER solutions inapplicable. Despite efforts to develop rule-based privacy-preserving ER methods, they often neglect subtle matching mechanisms and have poor effectiveness as a result. To bridge effectiveness and privacy, in this paper, we propose CampER, an effective framework for privacy-aware deep entity resolution. Specifically, we first design a training pair self-generation strategy to overcome the absence of manually labeled data in privacy-aware scenarios. Based on the self-constructed training pairs, we present a collaborative fine-tuning approach to learn the match-aware and uni-space individual tuple embeddings for accurate matching decisions. During the matching decision-making process, we first introduce a cryptographically secure approach to determine matches. Furthermore, we propose an order-preserving perturbation strategy to significantly accelerate the matching computation while guaranteeing the consistency of ER results. Extensive experiments on eight widely-used benchmark datasets demonstrate that CampER not only is comparable with the state-of-the-art standard deep ER solutions in effectiveness, but also preserves privacy.
Pages: 626-637
Page count: 12
Related papers (50 in total)
  • [21] Towards a Privacy-Aware Quantified Self Data Management Framework
    Thuraisingham, Bhavani
    Kantarcioglu, Murat
    Bertino, Elisa
    Bakdash, Jonathan Z.
    Fernandez, Maribel
    SACMAT'18: PROCEEDINGS OF THE 23RD ACM SYMPOSIUM ON ACCESS CONTROL MODELS & TECHNOLOGIES, 2018, : 173 - 184
  • [22] A Privacy-Aware Framework for Friend Recommendations in Online Social Networks
    Alkanhal, Mona
    Samanthula, Bharath K.
    2019 22ND IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (IEEE CSE 2019) AND 17TH IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (IEEE EUC 2019), 2019, : 188 - 193
  • [23] Privacy-aware Online Task Assignment Framework for Mobile Crowdsensing
    Gong, Wei
    Zhang, Baoxian
    Li, Cheng
    ICC 2019 - 2019 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2019,
  • [24] PADL: Privacy-Aware and Asynchronous Deep Learning for IoT Applications
    Liu, Xiaoyuan
    Li, Hongwei
    Xu, Guowen
    Liu, Sen
    Liu, Zhe
    Lu, Rongxing
    IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (08): : 6955 - 6969
  • [25] Enabling Live Video Analytics with a Scalable and Privacy-Aware Framework
    Wang, Junjue
    Amos, Brandon
    Das, Anupam
    Pillai, Padmanabhan
    Sadeh, Norman
    Satyanarayanan, Mahadev
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2018, 14 (03)
  • [26] An Efficient Privacy-Aware Split Learning Framework for Satellite Communications
    Sun, Jianfei
    Wu, Cong
    Mumtaz, Shahid
    Tao, Junyi
    Cao, Mingsheng
    Wang, Mei
    Frascolla, Valerio
    IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2024, 42 (12) : 3355 - 3365
  • [27] Privacy-Aware Process Performance Indicators: Framework and Release Mechanisms
    Kabierski, Martin
    Fahrenkrog-Petersen, Stephan A.
    Weidlich, Matthias
    ADVANCED INFORMATION SYSTEMS ENGINEERING (CAISE 2021), 2021, 12751 : 19 - 36
  • [28] Privacy-Aware Identity Cloning Detection Based on Deep Forest
    Alharbi, Ahmed
    Dong, Hai
    Yi, Xun
    Abeysekara, Prabath
    SERVICE-ORIENTED COMPUTING (ICSOC 2021), 2021, 13121 : 415 - 430
  • [29] Privacy-Aware Kalman Filtering
    Song, Yang
    Wang, Chong Xiao
    Tay, Wee Peng
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4434 - 4438
  • [30] Privacy-Aware QoE Evaluation
    Zhou, Liang
    Wei, Xin
    Cui, Jingwu
    Zheng, Baoyu
    2017 IEEE 85TH VEHICULAR TECHNOLOGY CONFERENCE (VTC SPRING), 2017,