On Early Stopping in Gradient Descent Learning

Cited by: 1
Authors
Yuan Yao
Lorenzo Rosasco
Andrea Caponnetto
Affiliations
[1] Department of Mathematics, University of California
[2] C.B.C.L., Massachusetts Institute of Technology, Bldg. E25-201, 45 Carleton St.
[3] DISI, Università di Genova, Via Dodecaneso 35
Keywords
Convergence Rate; Gradient Descent; Tikhonov Regularization; Reproducing Kernel Hilbert Space; Gradient Descent Method
Abstract
In this paper we study a family of gradient descent algorithms to approximate the regression function from reproducing kernel Hilbert spaces (RKHSs); the family is characterized by a polynomially decreasing rate of step sizes (or learning rates). By solving a bias-variance trade-off we obtain an early stopping rule and some probabilistic upper bounds for the convergence of the algorithms. We also discuss the implications of these results in the context of classification, where some fast convergence rates can be achieved for plug-in classifiers. Some connections are drawn with Boosting, Landweber iterations, and online learning algorithms as stochastic approximations of the gradient descent method.
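As a rough illustration of the kind of algorithm the abstract describes, the following is a minimal sketch of kernel gradient descent for least squares with polynomially decaying step sizes. It rests on several assumptions that are not stated in the abstract: the step-size schedule is taken to be gamma_t = 1 / (kappa^2 * t^theta) with theta in [0, 1), the kernel is Gaussian, the data are synthetic, and the stopping time is picked by hold-out validation error rather than by the theoretical bias-variance rule derived in the paper. The names `gaussian_kernel` and `kernel_gd_early_stopping` are hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, width=0.5):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * width**2))


def kernel_gd_early_stopping(K_train, y_train, K_val, y_val, theta=0.0, max_iter=500):
    """Gradient descent on the empirical least-squares risk in an RKHS.

    The iterate is f_t = sum_i coef[i] * K(x_i, .). The step-size schedule
    gamma_t = 1 / (kappa^2 * t^theta) is an assumed form, and stopping by
    validation error is a stand-in for a theoretical stopping rule.
    """
    m = len(y_train)
    kappa_sq = np.max(np.diag(K_train))  # empirical bound on K(x, x)
    coef = np.zeros(m)
    best_coef, best_t, best_err = coef.copy(), 0, np.inf
    val_errors = []
    for t in range(1, max_iter + 1):
        gamma = 1.0 / (kappa_sq * t**theta)
        residual = K_train @ coef - y_train      # f_t(x_i) - y_i
        coef = coef - (gamma / m) * residual     # RKHS gradient step
        err = float(np.mean((K_val @ coef - y_val) ** 2))
        val_errors.append(err)
        if err < best_err:
            best_coef, best_t, best_err = coef.copy(), t, err
    return best_coef, best_t, val_errors


# Synthetic one-dimensional regression problem (illustrative data only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(120, 1))
y = np.sin(3.0 * X[:, 0]) + 0.3 * rng.standard_normal(120)
X_tr, y_tr, X_va, y_va = X[:80], y[:80], X[80:], y[80:]

K_tr = gaussian_kernel(X_tr, X_tr)
K_va = gaussian_kernel(X_va, X_tr)
coef, stop_t, val_errors = kernel_gd_early_stopping(K_tr, y_tr, K_va, y_va, theta=0.25)
print(f"stopped at iteration {stop_t}, validation MSE {val_errors[stop_t - 1]:.4f}")
```

In this sketch the iteration count plays the role of the regularization parameter: running fewer iterations keeps the estimator smoother, while running more fits the training data more closely. Setting `theta=0.0` gives a constant step size.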
Pages: 289 - 315
Page count: 26