MONK - Outlier-Robust Mean Embedding Estimation by Median-of-Means
Citations: 0
Authors:
Lerasle, Matthieu [1,2]
Szabo, Zoltan [3]
Mathieu, Timothee [1]
Lecue, Guillaume [4]
Affiliations:
[1] Univ Paris Sud, Lab Math Orsay, Paris, France
[2] Univ Paris Saclay, CNRS, Paris, France
[3] Ecole Polytech, CMAP, Palaiseau, France
[4] CREST ENSAE ParisTech, Paris, France
Source:
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97 | 2019 / Vol. 97
Keywords:
KERNELS; METRICS
DOI:
N/A
CLC classification:
TP18 [Artificial Intelligence Theory]
Subject codes:
081104; 0812; 0835; 1405
Abstract:
Mean embeddings provide an extremely flexible and powerful tool in machine learning and statistics to represent probability distributions and define a semi-metric (MMD, maximum mean discrepancy; also called N-distance or energy distance), with numerous successful applications. The representation is constructed as the expectation of the feature map defined by a kernel. Being a mean, however, its classical empirical estimator can be arbitrarily severely affected by even a single outlier when the features are unbounded. To the best of our knowledge, even the consistency of the few existing techniques that try to alleviate this serious sensitivity bottleneck is unknown. In this paper, we show how the recently emerged median-of-means principle can be used to design estimators for the kernel mean embedding and MMD with strong resistance to outliers, and optimal sub-Gaussian deviation bounds under mild assumptions.
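To illustrate the failure mode the abstract describes and the median-of-means remedy, the sketch below estimates the mean embedding mu_P(y) = E[k(X, y)] with an unbounded (linear) kernel. This is a minimal illustration of the median-of-means principle only, not the authors' MONK estimator; the kernel choice, block count, and function names here are assumptions for the demo. The sample is split into disjoint blocks, the empirical mean embedding is computed on each block, and the pointwise median across blocks is returned, so a single huge outlier can corrupt at most one block.

```python
import numpy as np

def linear_kernel(x, y):
    # Unbounded feature map: k(x, y) = x * y, so mu_P(y) = y * E[X].
    # One large outlier can move the empirical mean arbitrarily far.
    return x * y

def mom_mean_embedding(sample, test_points, n_blocks, kernel=linear_kernel, seed=0):
    """Median-of-means estimate of mu_P(y) = E[k(X, y)] at each test point.

    Split the sample into n_blocks disjoint blocks, compute the empirical
    mean embedding on each block, then take the pointwise median over blocks.
    (Illustrative sketch; MONK itself uses a more refined construction.)
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(sample))
    block_means = np.array([
        kernel(sample[b][:, None], test_points[None, :]).mean(axis=0)
        for b in np.array_split(idx, n_blocks)
    ])  # shape: (n_blocks, n_test_points)
    return np.median(block_means, axis=0)

rng = np.random.default_rng(42)
clean = rng.normal(0.0, 1.0, size=500)   # X ~ N(0, 1), so mu_P(y) = 0
corrupted = np.append(clean, 1e6)        # a single adversarial outlier

y = np.array([1.0])
naive = linear_kernel(corrupted[:, None], y[None, :]).mean(axis=0)
robust = mom_mean_embedding(corrupted, y, n_blocks=9)
print(naive, robust)  # naive is blown up by the outlier; robust stays near 0
```

With 9 blocks the outlier inflates only one block mean, and the median discards it, matching the sub-Gaussian-style robustness the abstract claims for median-of-means under unbounded features.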