Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers

Cited by: 6
Authors
Paquin, Alexandre Lemire [1 ]
Chaib-draa, Brahim [1 ]
Giguere, Philippe [1 ]
Affiliations
[1] Laval Univ, Dept Comp Sci & Software Engn, Pavillon Adrien-Pouliot, 1065 Ave Med, Quebec City, PQ G1V 0A6, Canada
Keywords
Generalization; Deep learning; Stochastic gradient descent; Stability
DOI
10.1016/j.neunet.2023.04.028
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We prove new generalization bounds for stochastic gradient descent when training classifiers with invariances. Our analysis is based on the stability framework and covers both the convex case of linear classifiers and the non-convex case of homogeneous neural networks. We analyze stability with respect to the normalized version of the loss function used for training, which leads to investigating a form of angle-wise stability instead of Euclidean stability in the weights. For neural networks, the distance measure we consider is invariant to rescaling the weights of each layer. Furthermore, we exploit the notion of on-average stability to obtain a data-dependent quantity in the bound. In our numerical experiments, this data-dependent quantity is more favorable when training with larger learning rates, which may help shed some light on why larger learning rates can lead to better generalization in some practical scenarios. © 2023 Elsevier Ltd. All rights reserved.
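The abstract contrasts Euclidean stability in weights with an angle-wise notion whose distance is invariant to rescaling each layer of a homogeneous network. The minimal sketch below (not the paper's code; the function names and the particular layer-wise angle distance are illustrative assumptions) shows why this matters: rescaling one layer of a two-layer ReLU network by c and the next by 1/c leaves the predictor unchanged, yet the Euclidean distance between the two weight settings is large, while a layer-wise angular distance is zero.

```python
# Illustrative sketch (assumed, not the paper's implementation):
# a two-layer ReLU network f(x) = W2 @ relu(W1 @ x) is positively homogeneous
# in each layer, so (c*W1, W2/c) computes the same function as (W1, W2).
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))   # first-layer weights
W2 = rng.standard_normal((1, 8))   # second-layer weights
x = rng.standard_normal(4)         # one input

relu = lambda z: np.maximum(z, 0.0)
net = lambda A, B, v: (B @ relu(A @ v)).item()

c = 3.0
# Per-layer rescaling leaves the predictor (and hence the training loss) unchanged.
assert np.isclose(net(W1, W2, x), net(c * W1, W2 / c, x))

def euclidean_distance(params_a, params_b):
    """Plain Euclidean distance between the stacked weights of two networks."""
    return float(np.sqrt(sum(np.linalg.norm(a - b) ** 2
                             for a, b in zip(params_a, params_b))))

def layerwise_angle_distance(params_a, params_b):
    """Sum of angles between corresponding layers' flattened weights;
    invariant to rescaling any layer by a positive constant (assumed form,
    used here only to illustrate the idea of an angle-wise distance)."""
    total = 0.0
    for a, b in zip(params_a, params_b):
        ua = a.ravel() / np.linalg.norm(a)
        ub = b.ravel() / np.linalg.norm(b)
        total += np.arccos(np.clip(ua @ ub, -1.0, 1.0))
    return float(total)

original = (W1, W2)
rescaled = (c * W1, W2 / c)
print("Euclidean distance :", euclidean_distance(original, rescaled))       # large
print("Angle-wise distance:", layerwise_angle_distance(original, rescaled)) # ~ 0
```

Because the loss is unchanged under such rescalings, a bound stated in terms of a rescale-invariant, angle-wise distance is the natural object to track; the exact distance analyzed in the paper may differ from the simple layer-wise angle used in this sketch.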
Pages: 382-394
Number of pages: 13
Related papers
50 records in total
  • [1] Calibrated Stochastic Gradient Descent for Convolutional Neural Networks
    Zhuo, Li'an
    Zhang, Baochang
    Chen, Chen
    Ye, Qixiang
    Liu, Jianzhuang
    Doermann, David
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 9348 - 9355
  • [2] Convergence of gradient descent for learning linear neural networks
    Nguegnang, Gabin Maxime
    Rauhut, Holger
    Terstiege, Ulrich
    ADVANCES IN CONTINUOUS AND DISCRETE MODELS, 2024, 2024 (01)
  • [3] Is Learning in Biological Neural Networks Based on Stochastic Gradient Descent? An Analysis Using Stochastic Processes
    Christensen, Soeren
    Kallsen, Jan
    NEURAL COMPUTATION, 2024, 36 (07) : 1424 - 1432
  • [4] Learning curves for stochastic gradient descent in linear feedforward networks
    Werfel, J
    Xie, XH
    Seung, HS
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 1197 - 1204
  • [5] Learning curves for stochastic gradient descent in linear feedforward networks
    Werfel, J
    Xie, XH
    Seung, HS
    NEURAL COMPUTATION, 2005, 17 (12) : 2699 - 2718
  • [6] Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
    Cui, Xiaodong
    Zhang, Wei
    Tuske, Zoltan
    Picheny, Michael
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [7] A GEOMETRIC APPROACH OF GRADIENT DESCENT ALGORITHMS IN LINEAR NEURAL NETWORKS
    Chitour, Yacine
    Liao, Zhenyu
    Couillet, Romain
    MATHEMATICAL CONTROL AND RELATED FIELDS, 2023, 13 (03) : 918 - 945
  • [8] Optimizing Deep Neural Networks Through Neuroevolution With Stochastic Gradient Descent
    Zhang, Haichao
    Hao, Kuangrong
    Gao, Lei
    Wei, Bing
    Tang, Xuesong
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2023, 15 (01) : 111 - 121
  • [9] Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
    Cao, Yuan
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [10] Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent
    Whiting, Wes
    Wang, Bao
    Xin, Jack
    COMMUNICATIONS ON APPLIED MATHEMATICS AND COMPUTATION, 2024, 6 (02) : 1175 - 1188