Differentially Private k-Nearest Neighbor Missing Data Imputation

Cited by: 4

Authors:

Clifton, Chris [1]

Hanson, Eric J. [2]

Merrill, Keith [3]

Merrill, Shawn [1]

Affiliations:

[1] Purdue Univ, 305 N Univ St, W Lafayette, IN 47906 USA

[2] Univ Quebec Montreal, Lab Combinatoire & Informat Math, Montreal, PQ H3C 3P8, Canada

[3] Brandeis Univ, 415 South St, Waltham, MA 02453 USA

Source:

ACM TRANSACTIONS ON PRIVACY AND SECURITY | 2022 / Vol. 25 / Issue 03

Keywords:

Differential privacy; statistical disclosure limitation; private data cleaning; smooth sensitivity

DOI:

10.1145/3507952

Chinese Library Classification:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Using smooth sensitivity techniques, we develop a method for k-nearest neighbor missing data imputation with differential privacy. This requires bounding the number of data-incomplete tuples whose data-complete "donor" can change when a single record is added to or deleted from the dataset. Because a single individual can affect many imputed values, our mechanisms necessarily require more noise than mechanisms that ignore missing data, but we show empirically that this cost is significantly outweighed by the bias reduction from imputing missing data.
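
To make the quantity in the abstract concrete, here is a minimal sketch of k-nearest neighbor donor imputation, together with a count of how many incomplete tuples change donors when one complete record is deleted. It is an illustration only, not the paper's mechanism: the function names, the toy data, the Euclidean distance, and the mean-of-donors rule are all assumptions, and no noise is added.

```python
from math import dist

# Hypothetical toy data: each complete row is (observed attributes, value to donate).
complete = [((0.0,), 10.0), ((1.0,), 20.0), ((5.0,), 50.0)]
# Incomplete rows have only the observed attributes; the value is missing.
incomplete = [(0.2,), (0.4,), (4.0,)]

def knn_donors(query, complete, k):
    """Indices of the k complete rows nearest to `query` (ties broken by index)."""
    order = sorted(range(len(complete)),
                   key=lambda i: (dist(query, complete[i][0]), i))
    return order[:k]

def impute(incomplete, complete, k):
    """Impute each missing value as the mean of the k donors' values."""
    imputed = []
    for query in incomplete:
        donors = knn_donors(query, complete, k)
        imputed.append(sum(complete[i][1] for i in donors) / len(donors))
    return imputed

def donor_changes(incomplete, complete, k, removed):
    """Count incomplete rows whose donor set changes if complete[removed] is deleted.

    Under deletion, a donor set changes exactly when the deleted row was a donor.
    """
    return sum(1 for query in incomplete
               if removed in knn_donors(query, complete, k))

imputed = impute(incomplete, complete, k=1)
changes = [donor_changes(incomplete, complete, 1, r) for r in range(len(complete))]
```

In this toy dataset, deleting the first complete record affects two imputed values at once, which is the kind of multiplied individual impact that the abstract says must be bounded before noise can be calibrated.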

Citations

Pages: 23

50 entries in total

[21] Differentially private nearest neighbor classification
Gursoy, Mehmet Emre
Inan, Ali
Nergiz, Mehmet Ercan
Saygin, Yucel
DATA MINING AND KNOWLEDGE DISCOVERY, 2017, 31 (05) : 1544 - 1575
[22] Benchmarking k-nearest neighbour imputation with homogeneous Likert data
Jonsson, Per
Wohlin, Claes
EMPIRICAL SOFTWARE ENGINEERING, 2006, 11 (03) : 463 - 489
[23] An evaluation of k-nearest neighbour imputation using Likert data
Jönsson, P
Wohlin, C
10TH INTERNATIONAL SYMPOSIUM ON SOFTWARE METRICS, PROCEEDINGS, 2004 : 108 - 118
[24] Grey Relational Analysis based k Nearest Neighbor Missing Data Imputation for Software Quality Datasets
Huang, Jianglin
Sun, Hongyi
2016 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY (QRS 2016), 2016 : 86 - 91
[25] Benchmarking k-nearest neighbour imputation with homogeneous Likert data
Per Jönsson
Claes Wohlin
Empirical Software Engineering, 2006, 11
[26] Application of imputation methods for missing values of PM10 and O3 data: Interpolation, moving average and K-nearest neighbor methods
Saeipourdizaj, Parisa
Sarbakhsh, Parvin
Gholampour, Akbar
ENVIRONMENTAL HEALTH ENGINEERING AND MANAGEMENT JOURNAL, 2021, 8 (03) : 215 - 226
[27] k-nearest neighbor imputation method and its application in fault diagnosis of industrial process
Li, Yuan
Wu, Jie
Wang, Guo-Zhu
Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University, 2015, 49 (06) : 830 - 836
[28] Scalable Evidential K-Nearest Neighbor Classification on Big Data
Gong, Chaoyu
Demmel, Jim
You, Yang
IEEE TRANSACTIONS ON BIG DATA, 2024, 10 (03) : 226 - 237
[29] MKNN: Modified K-Nearest Neighbor
Parvin, Hamid
Alizadeh, Hosein
Minaei-Bidgoli, Behrouz
WCECS 2008: WORLD CONGRESS ON ENGINEERING AND COMPUTER SCIENCE, 2008 : 831 - 834
[30] A GENERALIZED K-NEAREST NEIGHBOR RULE
PATRICK, EA
FISCHER, FP
INFORMATION AND CONTROL, 1970, 16 (02) : 128 - &
