Efficient Data Shapley for Weighted Nearest Neighbor Algorithms

Cited by: 0
Authors
Wang, Jiachen T. [1 ]
Mittal, Prateek [1 ]
Jia, Ruoxi [2 ]
Affiliations
[1] Princeton Univ, Princeton, NJ 08544 USA
[2] Virginia Tech, Blacksburg, VA 24061 USA
Funding
U.S. National Science Foundation
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work addresses an open problem in the data valuation literature: the efficient computation of Data Shapley for weighted K-nearest-neighbor algorithms (WKNN-Shapley). By taking the accuracy of hard-label KNN with discretized weights as the utility function, we reframe the computation of WKNN-Shapley as a counting problem and introduce a quadratic-time algorithm, a notable improvement over $O(N^K)$, the best result in the existing literature. We also develop a deterministic approximation algorithm that further improves computational efficiency while maintaining the key fairness properties of the Shapley value. Through extensive experiments, we demonstrate WKNN-Shapley's computational efficiency and its superior performance in discerning data quality compared to its unweighted counterpart.
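To make the quantity in the abstract concrete, the following is a minimal brute-force sketch of Data Shapley with a hard-label weighted KNN utility on a single test point. It is not the quadratic-time algorithm of the paper: it enumerates every subset and is therefore exponential in N. The function names, the one-dimensional features, the `exp(-distance)` weight (left undiscretized here), the tie-breaking rule, and the zero utility for an empty training set are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb, exp


def wknn_utility(subset, train, test_x, test_y, k=2):
    """Return 1.0 if weighted KNN trained on `subset` labels the test point correctly."""
    if not subset:
        return 0.0  # assumption: an empty training set has zero utility
    # Keep the k training points in the subset that are closest to the test point.
    nearest = sorted(subset, key=lambda i: abs(train[i][0] - test_x))[:k]
    votes = {}
    for i in nearest:
        x, y = train[i]
        # Assumed weight function: closer neighbors get larger votes.
        votes[y] = votes.get(y, 0.0) + exp(-abs(x - test_x))
    # Hard-label prediction; ties go to the smallest label (assumption).
    predicted = max(sorted(votes), key=lambda label: votes[label])
    return 1.0 if predicted == test_y else 0.0


def exact_shapley(train, test_x, test_y, k=2):
    """Exact Shapley value of every training point by enumerating all subsets."""
    n = len(train)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size that exclude point i.
            weight = 1.0 / (n * comb(n - 1, size))
            for s in combinations(others, size):
                gain = (wknn_utility(s + (i,), train, test_x, test_y, k)
                        - wknn_utility(s, train, test_x, test_y, k))
                phi += weight * gain
        values.append(phi)
    return values


if __name__ == "__main__":
    # Toy data: (feature, label) pairs and one test point at x = 0.0 with label 1.
    toy_train = [(0.1, 1), (0.4, 0), (0.5, 1), (2.0, 0)]
    print(exact_shapley(toy_train, test_x=0.0, test_y=1, k=2))
```

In this sketch the values sum to the utility of the full training set minus that of the empty set, which is the efficiency property of the Shapley value. The enumeration visits all 2^(N-1) subsets per point, which is why it is only usable on toy inputs.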
Pages: 39