NONPARAMETRIC CLASSIFICATION WITH MISSING DATA

Citations: 0
Authors
Sell, Torben [1 ,2 ]
Berrett, Thomas B. [3 ]
Cannings, Timothy I. [1 ,2 ]
Affiliations
[1] Univ Edinburgh, Sch Math, Edinburgh, Scotland
[2] Univ Edinburgh, Maxwell Inst Math Sci, Edinburgh, Scotland
[3] Univ Warwick, Dept Stat, Coventry, England
Source
ANNALS OF STATISTICS | 2024 / Vol. 52 / No. 03
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Missing data; classification; minimax; MINIMAX RATE; DISCRIMINATION;
DOI
10.1214/24-AOS2389
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification code
020208 ; 070103 ; 0714 ;
Abstract
We introduce a new nonparametric framework for classification problems in the presence of missing data. The key aspect of our framework is that the regression function decomposes into an ANOVA-type sum of orthogonal functions, of which some (or even many) may be zero. Working under a general missingness setting, which allows features to be missing not at random, our main goal is to derive the minimax rate for the excess risk in this problem. In addition to the decomposition property, the rate depends on parameters that control the tail behaviour of the marginal feature distributions, the smoothness of the regression function and a margin condition. The ambient data dimension does not appear in the minimax rate, which can therefore be faster than in the classical nonparametric setting. We further propose a classifier, called the HAM classifier, based on a careful combination of a k-nearest neighbour algorithm and a thresholding step. The HAM classifier attains the minimax rate up to polylogarithmic factors, and numerical experiments further illustrate its utility.
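For readers unfamiliar with the terms in the abstract, the standard objects it refers to can be written out as follows. This is a sketch in generic textbook notation for binary classification with fully observed features; the symbols (\eta, C, \mathcal{E}, f_S) are assumptions made here for illustration and are not taken from the paper, whose exact definitions, missingness model and orthogonality conditions may differ.

```latex
% Assumed notation (illustrative only): feature vector X in R^d, label Y in {0,1}.
% Regression function and the Bayes classifier it induces:
\eta(x) = \mathbb{P}(Y = 1 \mid X = x), \qquad
C^{\mathrm{Bayes}}(x) = \mathbb{1}\{\eta(x) \ge 1/2\}.

% Excess risk of a classifier C (always non-negative):
\mathcal{E}(C) = \mathbb{P}\{C(X) \neq Y\} - \mathbb{P}\{C^{\mathrm{Bayes}}(X) \neq Y\}.

% Generic ANOVA-type decomposition over subsets S of the d features,
% where x_S denotes the coordinates of x indexed by S and some f_S may be zero:
\eta(x) = \sum_{S \subseteq \{1, \dots, d\}} f_S(x_S).
```

Under this reading, a minimax rate describes how fast the worst-case excess risk of the best possible classifier can shrink as the sample size grows.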
Pages: 1178 - 1200
Number of pages: 23
Related papers
50 items in total
  • [41] Classification of data with missing elements and outliers
    Stanimirova, I.
    Walczak, B.
    TALANTA, 2008, 76 (03) : 602 - 609
  • [42] Optimal classification and nonparametric regression for functional data
    Meister, Alexander
    BERNOULLI, 2016, 22 (03) : 1729 - 1744
  • [43] Fast nonparametric classification based on data depth
    Tatjana Lange
    Karl Mosler
    Pavlo Mozharovskyi
    Statistical Papers, 2014, 55 : 49 - 69
  • [44] NONPARAMETRIC BAYESIAN SUPERVISED CLASSIFICATION OF FUNCTIONAL DATA
    Rabaoui, Asma
    Kadri, Hachem
    Davy, Manuel
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 3381 - 3384
  • [45] Pattern classification with missing data: a review
    Pedro J. García-Laencina
    José-Luis Sancho-Gómez
    Aníbal R. Figueiras-Vidal
    Neural Computing and Applications, 2010, 19 : 263 - 282
  • [46] Classification with Low Rank and Missing Data
    Hazan, Elad
    Livni, Roi
    Mansour, Yishay
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 257 - 266
  • [47] Pattern classification with missing data: a review
    Garcia-Laencina, Pedro J.
    Sancho-Gomez, Jose-Luis
    Figueiras-Vidal, Anibal R.
    NEURAL COMPUTING & APPLICATIONS, 2010, 19 (02): : 263 - 282
  • [48] Nonparametric criteria for supervised classification of fuzzy data
    Colubi, Ana
    Gonzalez-Rodriguez, Gil
    Angeles Gil, M.
    Trutschnig, Wolfgang
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2011, 52 (09) : 1272 - 1282
  • [49] Nearest Subspace Classification with Missing Data
    Chi, Yuejie
    2013 ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2013, : 1667 - 1671
  • [50] A nonparametric inverse probability weighted estimation for functional data with missing response data at random
    Wang, Longbing
    Cao, Ruiyuan
    Du, Jiang
    Zhang, Zhongzhan
    JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2019, 48 (04) : 537 - 546