Key-value data collection and statistical analysis with local differential privacy

Cited by: 1
Authors
Zhu, Hui [1]
Tang, Xiaohu [1]
Yang, Laurence Tianruo [2, 3, 4]
Fu, Chao [5]
Peng, Shuangrong [1]
Affiliations
[1] Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu, Peoples R China
[2] Hainan Univ, Sch Comp Sci & Technol, Haikou, Peoples R China
[3] St Francis Xavier Univ, Dept Comp Sci, Antigonish, NS, Canada
[4] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China
[5] Southwest Jiaotong Univ, Sch Math, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Key-value data; Local differential privacy; Mean estimation; Frequency estimation; Range queries
DOI
10.1016/j.ins.2023.119058
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The collection and statistical analysis of simple data types (e.g., categorical, numerical, and multi-dimensional data) under local differential privacy have been widely studied. Recently, researchers have turned to the collection of key-value data, one of the main NoSQL data models. When key-value data is collected and analyzed under local differential privacy, the frequency and the mean of each key must be estimated simultaneously. Achieving a good utility-privacy tradeoff is difficult, however, because the key and the value in each pair are inherently correlated and users may hold different numbers of key-value pairs. In this paper, we propose an efficient sampling-based scheme for collecting and analyzing key-value data. Under the same perturbation level and perturbation algorithm, the more valid data that is collected, the more accurate the estimated statistics are. We therefore make full use of probability sampling and the inherent correlation of key-value data to increase the probability that users submit valid key-value data. Moreover, we optimize the privacy budget allocation over key-value data so that the overall variance of the frequency and mean estimates is close to optimal. Detailed theoretical analysis and experimental results show that the proposed scheme is more accurate than existing schemes.
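As a rough illustration of the frequency-estimation task the abstract refers to, the sketch below uses plain binary randomized response, a standard local differential privacy primitive. It is not the scheme proposed in the paper: the function names `perturb_bit` and `estimate_frequency`, the privacy budget, and the simulated population are all assumptions made for this example, and the paper's sampling and budget-allocation steps are not modeled.

```python
import math
import random


def perturb_bit(bit, epsilon, rng):
    """Binary randomized response (hypothetical helper).

    The user reports the true bit (1 = "I hold this key") with
    probability p = e^eps / (e^eps + 1) and the flipped bit otherwise.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p else 1 - bit


def estimate_frequency(reports, epsilon):
    """Unbiased estimate of the true fraction of 1s from perturbed reports.

    E[observed] = f * p + (1 - f) * (1 - p), so
    f = (observed - (1 - p)) / (2p - 1).
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    epsilon = 1.0  # assumed privacy budget for the key indicator
    # Simulated population: 30% of users hold the key of interest.
    true_bits = [1] * 6000 + [0] * 14000
    reports = [perturb_bit(b, epsilon, rng) for b in true_bits]
    print("estimated frequency:", estimate_frequency(reports, epsilon))
```

A smaller `epsilon` gives stronger privacy but a noisier estimate, which is the utility-privacy tradeoff the abstract describes. Estimating a per-key mean would additionally require perturbing the values themselves.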
Pages: 18