On Early Stopping in Gradient Descent Learning

Cited by: 1
Authors
Yuan Yao
Lorenzo Rosasco
Andrea Caponnetto
Affiliations
[1] Department of Mathematics, University of California
[2] C.B.C.L., Massachusetts Institute of Technology, Bldg. E25-201, 45 Carleton St.
[3] DISI, Università di Genova, Via Dodecaneso 35
Keywords
Convergence Rate; Gradient Descent; Tikhonov Regularization; Reproducing Kernel Hilbert Space; Gradient Descent Method
DOI: not available
Abstract
In this paper we study a family of gradient descent algorithms for approximating the regression function from reproducing kernel Hilbert spaces (RKHSs), the family being characterized by a polynomially decreasing step size (learning rate). By solving a bias-variance trade-off we obtain an early stopping rule and probabilistic upper bounds on the convergence of the algorithms. We also discuss the implications of these results for classification, where fast convergence rates can be achieved by plug-in classifiers. Connections are drawn with boosting, Landweber iterations, and online learning algorithms viewed as stochastic approximations of the gradient descent method.
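The setting described in the abstract can be sketched in code: gradient descent on the empirical least-squares risk over an RKHS, with step sizes decaying polynomially in the iteration count, halted by an early stopping rule. The sketch below is illustrative only and is not the paper's exact procedure: the Gaussian kernel, all hyperparameter values, and the validation-based stopping criterion (a practical surrogate for the paper's theoretical, sample-size-dependent stopping rule) are assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kernel_gd_early_stop(X, y, X_val, y_val, gamma0=1.0, theta=0.25,
                         max_iter=500, patience=10):
    """Kernel gradient descent for least squares with polynomially
    decaying step sizes gamma_t = gamma0 / (t + 1)**theta, stopped
    early when the held-out error stops improving (an illustrative
    surrogate for the paper's theoretical stopping rule)."""
    n = len(y)
    K = gaussian_kernel(X, X)          # kernel matrix on training points
    K_val = gaussian_kernel(X_val, X)  # cross-kernel for validation points
    c = np.zeros(n)                    # f_t = sum_i c_i K(x_i, .)
    best_err, best_c, stall = np.inf, c.copy(), 0
    for t in range(max_iter):
        gamma = gamma0 / (t + 1) ** theta        # polynomially decaying step
        c -= gamma * (K @ c - y) / n             # gradient step on empirical risk
        err = np.mean((K_val @ c - y_val) ** 2)  # held-out squared error
        if err < best_err:
            best_err, best_c, stall = err, c.copy(), 0
        else:
            stall += 1
            if stall >= patience:                # early stopping
                break
    return best_c, best_err
```

Here the iterate is represented by its coefficient vector on the training points, so one gradient step in function space reduces to a single matrix-vector product; the decaying step size plays the role of implicit regularization, with the stopping time balancing bias against variance.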
Pages: 289 - 315 (26 pages)
Related papers (50 in total)
  • [21] Online gradient descent learning algorithms
    Ying, Yiming
    Pontil, Massimiliano
    FOUNDATIONS OF COMPUTATIONAL MATHEMATICS, 2008, 8 (05) : 561 - 596
  • [22] Gradient descent learning in and out of equilibrium
    Caticha, N
    de Oliveira, EA
    PHYSICAL REVIEW E, 2001, 63 (06): 061905
  • [23] Gradient descent for general reinforcement learning
    Baird, L
    Moore, A
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 11, 1999, 11 : 968 - 974
  • [24] Efficient learning with robust gradient descent
    Holland, Matthew J.
    Ikeda, Kazushi
    MACHINE LEARNING, 2019, 108 (8-9) : 1523 - 1560
  • [25] Learning to learn using gradient descent
    Hochreiter, S
    Younger, AS
    Conwell, PR
    ARTIFICIAL NEURAL NETWORKS-ICANN 2001, PROCEEDINGS, 2001, 2130 : 87 - 94
  • [26] Learning ReLUs via Gradient Descent
    Soltanolkotabi, Mahdi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [28] Annealed gradient descent for deep learning
    Pan, Hengyue
    Niu, Xin
    Li, RongChun
    Dou, Yong
    Jiang, Hui
    NEUROCOMPUTING, 2020, 380 : 201 - 211
  • [29] Efficient Dictionary Learning with Gradient Descent
    Gilboa, Dar
    Buchanan, Sam
    Wright, John
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [30] Orthogonal Gradient Descent for Continual Learning
    Farajtabar, Mehrdad
    Azizan, Navid
    Mott, Alex
    Li, Ang
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 3762 - 3772