Differentially Private k-Nearest Neighbor Missing Data Imputation

Cited by: 4
Authors
Clifton, Chris [1]
Hanson, Eric J. [2]
Merrill, Keith [3]
Merrill, Shawn [1]
Affiliations
[1] Purdue Univ, 305 N Univ St, W Lafayette, IN 47906 USA
[2] Univ Quebec Montreal, Lab Combinatoire & Informat Math, Montreal, PQ H3C 3P8, Canada
[3] Brandeis Univ, 415 South St, Waltham, MA 02453 USA
Keywords
Differential privacy; statistical disclosure limitation; private data cleaning; smooth sensitivity
DOI
10.1145/3507952
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Using techniques based on smooth sensitivity, we develop a method for k-nearest neighbor missing data imputation with differential privacy. This requires bounding the number of data-incomplete tuples whose data-complete "donor" can change when a single record is added to or deleted from the dataset. Because a single individual can affect many imputed values, our mechanisms necessarily add more noise than mechanisms that ignore missing data, but we show empirically that this cost is significantly outweighed by the bias reduction from imputing missing data.
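The quantity the abstract describes, the number of incomplete tuples whose donor changes after a single deletion, can be illustrated with a small sketch. This is not the paper's mechanism: it assumes k = 1, Euclidean distance, and a toy dataset, and it does not compute smooth sensitivity or add any noise. The function names `donor_map` and `changed_donors` and all data values are hypothetical.

```python
import math

def donor_map(complete, incomplete):
    """Map each incomplete row index to the id of its nearest complete donor (k = 1).

    `complete` maps a record id to (features, value); `incomplete` is a list of
    feature tuples whose value is missing. Ties are broken by record id.
    """
    return {
        j: min(complete, key=lambda rid: (math.dist(x, complete[rid][0]), rid))
        for j, x in enumerate(incomplete)
    }

def changed_donors(complete, incomplete, removed_id):
    """Count incomplete rows whose donor changes when one complete record is deleted."""
    before = donor_map(complete, incomplete)
    remaining = {rid: row for rid, row in complete.items() if rid != removed_id}
    after = donor_map(remaining, incomplete)
    return sum(before[j] != after[j] for j in before)

# Toy data (assumed for illustration only).
complete = {
    "a": ((0.0, 0.0), 10.0),
    "b": ((5.0, 5.0), 20.0),
    "c": ((9.0, 9.0), 30.0),
}
incomplete = [(0.5, 0.0), (0.0, 1.0), (1.0, 1.0), (8.0, 9.0)]

for rid in complete:
    print(rid, changed_donors(complete, incomplete, rid))
```

In this toy data, deleting record "a" changes the donor of three incomplete rows, while deleting "b" changes none. This is the multiplicity effect the abstract refers to: one individual's record can influence many imputed values, so the noise added for privacy has to account for more than a single changed row.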
Pages: 23