On Early Stopping in Gradient Descent Learning

Cited by: 1
Authors
Yuan Yao
Lorenzo Rosasco
Andrea Caponnetto
Affiliations
[1] Department of Mathematics, University of California
[2] C.B.C.L., Massachusetts Institute of Technology, Bldg. E25-201, 45 Carleton St.
[3] DISI, Università di Genova, Via Dodecaneso 35
Keywords
Convergence Rate; Gradient Descent; Tikhonov Regularization; Reproducing Kernel Hilbert Space; Gradient Descent Method
DOI: not available
Abstract
In this paper we study a family of gradient descent algorithms for approximating the regression function from reproducing kernel Hilbert spaces (RKHSs), the family being characterized by a polynomially decreasing step size (learning rate). By solving a bias-variance trade-off we obtain an early stopping rule and probabilistic upper bounds on the convergence of the algorithms. We also discuss the implications of these results for classification, where fast convergence rates can be achieved by plug-in classifiers. Connections are drawn with boosting, Landweber iterations, and online learning algorithms viewed as stochastic approximations of the gradient descent method.
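The setting described in the abstract can be sketched in code: gradient descent on the empirical least-squares risk over an RKHS, with step sizes decaying polynomially in the iteration count, halted by an early stopping rule. The sketch below is illustrative only and is not the paper's exact procedure: the Gaussian kernel, all hyperparameter values, and the validation-based stopping criterion (a practical surrogate for the paper's theoretical, sample-size-dependent stopping rule) are assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kernel_gd_early_stop(X, y, X_val, y_val, gamma0=1.0, theta=0.25,
                         max_iter=500, patience=10):
    """Kernel gradient descent for least squares with polynomially
    decaying step sizes gamma_t = gamma0 / (t + 1)**theta, stopped
    early when the held-out error stops improving (an illustrative
    surrogate for the paper's theoretical stopping rule)."""
    n = len(y)
    K = gaussian_kernel(X, X)          # kernel matrix on training points
    K_val = gaussian_kernel(X_val, X)  # cross-kernel for validation points
    c = np.zeros(n)                    # f_t = sum_i c_i K(x_i, .)
    best_err, best_c, stall = np.inf, c.copy(), 0
    for t in range(max_iter):
        gamma = gamma0 / (t + 1) ** theta        # polynomially decaying step
        c -= gamma * (K @ c - y) / n             # gradient step on empirical risk
        err = np.mean((K_val @ c - y_val) ** 2)  # held-out squared error
        if err < best_err:
            best_err, best_c, stall = err, c.copy(), 0
        else:
            stall += 1
            if stall >= patience:                # early stopping
                break
    return best_c, best_err
```

Here the iterate is represented by its coefficient vector on the training points, so one gradient step in function space reduces to a single matrix-vector product; the decaying step size plays the role of implicit regularization, with the stopping time balancing bias against variance.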
Pages: 289 - 315 (26 pages)
Related papers (50 in total)
  • [21] Online gradient descent learning algorithms
    Ying, Yiming
    Pontil, Massimiliano
    FOUNDATIONS OF COMPUTATIONAL MATHEMATICS, 2008, 8 (05) : 561 - 596
  • [22] Gradient descent learning in and out of equilibrium
    Caticha, N
    de Oliveira, EA
    PHYSICAL REVIEW E, 2001, 63 (06): 061905
  • [23] Gradient descent for general reinforcement learning
    Baird, L
    Moore, A
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 11, 1999, 11 : 968 - 974
  • [24] Efficient learning with robust gradient descent
    Holland, Matthew J.
    Ikeda, Kazushi
    MACHINE LEARNING, 2019, 108 (8-9) : 1523 - 1560
  • [25] Learning to learn using gradient descent
    Hochreiter, S
    Younger, AS
    Conwell, PR
    ARTIFICIAL NEURAL NETWORKS-ICANN 2001, PROCEEDINGS, 2001, 2130 : 87 - 94
  • [26] Learning ReLUs via Gradient Descent
    Soltanolkotabi, Mahdi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [28] Annealed gradient descent for deep learning
    Pan, Hengyue
    Niu, Xin
    Li, RongChun
    Dou, Yong
    Jiang, Hui
    NEUROCOMPUTING, 2020, 380 : 201 - 211
  • [29] Efficient Dictionary Learning with Gradient Descent
    Gilboa, Dar
    Buchanan, Sam
    Wright, John
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [30] Orthogonal Gradient Descent for Continual Learning
    Farajtabar, Mehrdad
    Azizan, Navid
    Mott, Alex
    Li, Ang
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 3762 - 3772