Combinatorial probability and the tightness of generalization bounds

Cited by: 15
Author
Vorontsov K.V. [1 ]
Affiliation
[1] Dorodnicyn Computing Centre, Russian Academy of Sciences, Moscow, 119333
Funding
Russian Foundation for Basic Research
Keywords
Training Sample; Uniform Convergence; Target Function; Growth Function; Empirical Prediction
DOI
10.1134/S1054661808020090
Abstract
Accurate prediction of the generalization ability of a learning algorithm is an important problem in computational learning theory. The classical Vapnik-Chervonenkis (VC) generalization bounds are too general and therefore overestimate the expected error. Data-dependent bounds obtained more recently remain overestimated. To find out why the bounds are loose, we reject the uniform convergence principle and apply a purely combinatorial approach that is free of any probabilistic assumptions, makes no approximations, and provides empirical control of looseness. We introduce new data-dependent complexity measures: a local shatter coefficient and a nonscalar local shatter profile, which can give much tighter bounds than the classical VC shatter coefficient. An experiment on real datasets shows that the effective local measures may take very small values; thus, the effective local VC dimension takes values in [0, 1] and is therefore not related to the dimension of the space. © 2008 Pleiades Publishing, Ltd.
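To make the abstract's central notion concrete, below is a minimal Python sketch of how a shatter coefficient can be evaluated empirically on a fixed sample, together with a hypothetical "localized" variant that counts only the patterns of classifiers near the best empirical error. The function names and the exact localization rule are illustrative assumptions, not the paper's definitions; the paper's local shatter coefficient and shatter profile are defined more carefully in the text itself.

```python
def empirical_shatter_coefficient(classifiers, sample):
    """Number of distinct binary label patterns the family induces on the sample.
    The classical VC shatter coefficient is the maximum of this quantity over
    all samples of a given size; here it is evaluated on one fixed sample."""
    patterns = {tuple(h(x) for x in sample) for h in classifiers}
    return len(patterns)

def local_shatter_coefficient(classifiers, sample, labels, radius):
    """Illustrative 'local' variant (an assumption, not the paper's definition):
    count only patterns produced by classifiers whose empirical error on the
    sample is within `radius` of the best achievable empirical error."""
    def err(h):
        return sum(h(x) != y for x, y in zip(sample, labels)) / len(sample)
    best = min(err(h) for h in classifiers)
    patterns = {tuple(h(x) for x in sample)
                for h in classifiers if err(h) <= best + radius}
    return len(patterns)

# Toy usage: threshold classifiers on the real line.
sample = [0.1, 0.4, 0.6, 0.9]
labels = [0, 0, 1, 1]
thresholds = [lambda x, t=t: int(x > t) for t in [0.0, 0.25, 0.5, 0.75, 1.0]]

print(empirical_shatter_coefficient(thresholds, sample))  # 5 distinct patterns
print(local_shatter_coefficient(thresholds, sample, labels, radius=0.0))  # 1
```

The toy run illustrates the effect the abstract reports: the global count over all five thresholds is 5, while restricting to near-optimal classifiers collapses it to 1, so the local measure can be far smaller than the classical one.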
Pages: 243–259
Page count: 16