We introduce a new nonparametric framework for classification problems in the presence of missing data. The key aspect of our framework is that the regression function decomposes into an ANOVA-type sum of orthogonal functions, of which some (or even many) may be zero. Working under a general missingness setting, which allows features to be missing not at random, our main goal is to derive the minimax rate for the excess risk in this problem. In addition to the decomposition property, the rate depends on parameters that control the tail behaviour of the marginal feature distributions, the smoothness of the regression function and a margin condition. The ambient data dimension does not appear in the minimax rate, which can therefore be faster than in the classical nonparametric setting. We further propose a classifier, which we call the HAM classifier, based on a careful combination of a k-nearest neighbour algorithm and a thresholding step. The HAM classifier attains the minimax rate up to polylogarithmic factors, and numerical experiments further illustrate its utility.
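To make the decomposition property concrete, one generic way of writing an ANOVA-type sum is sketched below. The notation is an assumption introduced here for illustration and is not taken from the text: \eta denotes the regression function, d the number of features, x_S the sub-vector of x with coordinates in S, and f_S the component attached to the feature subset S. Sparsity then corresponds to f_S being identically zero for some (or many) subsets S.

```latex
% Illustrative notation only (assumed, not the authors' own):
% eta = regression function, f_S = component depending on features in S.
\eta(x) \;=\; \mathbb{P}(Y = 1 \mid X = x)
        \;=\; \sum_{S \subseteq \{1,\dots,d\}} f_S(x_S),
\qquad
\mathbb{E}\bigl[ f_S(X_S)\, f_{S'}(X_{S'}) \bigr] = 0
\quad \text{for } S \neq S'.
```

Under this reading, a component f_S involves only the features in S, which suggests why a structure of this kind is convenient when other features are missing; the precise conditions are those of the framework itself.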