Top-k Discovery Under Local Differential Privacy: An Adaptive Sampling Approach

Authors
Du, Rong [1]
Ye, Qingqing [1]
Fu, Yue [1]
Hu, Haibo [1]
Huang, Kai [2]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Hong Kong, Peoples R China
[2] Macau Univ Sci & Technol, Sch Comp Sci & Engn, Macau, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Frequency estimation; Estimation; Differential privacy; Privacy; Radio spectrum management; Data collection; Time-frequency analysis; Random variables; Protocols; Probability distribution; Local differential privacy; multi-armed bandit; top-k estimation; DATA PUBLICATION; BOUNDS
DOI
10.1109/TDSC.2024.3471923
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812 ;
Abstract
Local differential privacy (LDP) is a promising privacy model for data collection that protects the sensitive information of individuals. However, applying LDP to top-k estimation in set-valued data (e.g., identifying the k most frequent items) may yield poor results for small and sparse datasets due to high sensitivity and heavy perturbation. To address this, we propose an adaptive approach that frames the task as a multi-armed bandit (MAB) problem, in which the decision-maker selects actions based on information collected from previous rounds to maximize the total reward over time. Building on this framing, we present two MAB-based adaptive sampling schemes: ARBS, which identifies the top-k items, and ARBSF, which both discovers the top-k items and estimates their frequencies. Furthermore, to address the potentially long delay of multi-round collection, we propose an optimization technique that reduces the time complexity. Both theoretical and empirical results show that our adaptive sampling schemes significantly outperform existing alternatives.
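The abstract does not spell out how ARBS or ARBSF work, so the following is only a minimal sketch of the general idea it describes: treat each item as a bandit arm, spend each round's users on the items that still look promising, and drop weak candidates between rounds. It is not the authors' algorithm. The perturbation used here is plain binary randomized response, the pruning rule is simple successive halving, and every name (`rr_bit`, `debias`, `adaptive_top_k`) is hypothetical. Each user answers exactly one perturbed yes/no query, so each user spends a privacy budget of `epsilon` once.

```python
import math
import random


def rr_bit(truth: bool, epsilon: float, rng: random.Random) -> bool:
    """Binary randomized response: keep the true bit with prob e^eps / (1 + e^eps)."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return truth if rng.random() < p_keep else not truth


def debias(mean_reported: float, epsilon: float) -> float:
    """Unbiased frequency estimate from the mean of the perturbed bits."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return (mean_reported - (1.0 - p)) / (2.0 * p - 1.0)


def adaptive_top_k(users, domain, k, epsilon, rounds, seed=0):
    """Illustrative successive halving (an assumption, not ARBS/ARBSF).

    users  : list of sets, one item set per user
    domain : iterable of all candidate items (the "arms")
    """
    rng = random.Random(seed)
    pool = list(users)
    rng.shuffle(pool)                      # each user is queried at most once
    batch = len(pool) // rounds
    candidates = list(domain)
    yes = {item: 0 for item in candidates}
    asked = {item: 0 for item in candidates}

    for r in range(rounds):
        round_users = pool[r * batch:(r + 1) * batch]
        # Spread this round's users evenly over the surviving candidates.
        for i, user_items in enumerate(round_users):
            item = candidates[i % len(candidates)]
            asked[item] += 1
            yes[item] += int(rr_bit(item in user_items, epsilon, rng))
        # Estimates accumulate over all rounds in which the item survived.
        est = {
            c: debias(yes[c] / asked[c], epsilon) if asked[c] else 0.0
            for c in candidates
        }
        # Keep the better half, but never fewer than k items.
        keep = max(k, len(candidates) // 2)
        candidates = sorted(candidates, key=lambda c: est[c], reverse=True)[:keep]

    return candidates[:k]
```

Because pruned items stop receiving queries, later rounds concentrate more users on the remaining candidates, which is the intuition behind collecting adaptively rather than in a single round.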
Pages: 1763-1780
Page count: 18