Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers

Cited by: 6
Authors
Paquin, Alexandre Lemire [1 ]
Chaib-draa, Brahim [1 ]
Giguere, Philippe [1 ]
Affiliations
[1] Laval Univ, Dept Comp Sci & Software Engn, Pavillon Adrien-Pouliot, 1065 Ave Med, Quebec City, PQ G1V 0A6, Canada
Keywords
Generalization; Deep learning; Stochastic gradient descent; Stability
DOI
10.1016/j.neunet.2023.04.028
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We prove new generalization bounds for stochastic gradient descent when training classifiers with invariances. Our analysis is based on the stability framework and covers both the convex case of linear classifiers and the non-convex case of homogeneous neural networks. We analyze stability with respect to the normalized version of the loss function used for training, which leads to investigating a form of angle-wise stability instead of Euclidean stability in the weights. For neural networks, the distance measure we consider is invariant to rescaling the weights of each layer. Furthermore, we exploit the notion of on-average stability to obtain a data-dependent quantity in the bound. In our numerical experiments, this data-dependent quantity is more favorable when training with larger learning rates, which may help shed some light on why larger learning rates can lead to better generalization in some practical scenarios. © 2023 Elsevier Ltd. All rights reserved.
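The abstract contrasts Euclidean stability in weights with an angle-wise notion whose distance is invariant to rescaling each layer of a homogeneous network. The minimal sketch below (not the paper's code; the function names and the particular layer-wise angle distance are illustrative assumptions) shows why this matters: rescaling one layer of a two-layer ReLU network by c and the next by 1/c leaves the predictor unchanged, yet the Euclidean distance between the two weight settings is large, while a layer-wise angular distance is zero.

```python
# Illustrative sketch (assumed, not the paper's implementation):
# a two-layer ReLU network f(x) = W2 @ relu(W1 @ x) is positively homogeneous
# in each layer, so (c*W1, W2/c) computes the same function as (W1, W2).
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))   # first-layer weights
W2 = rng.standard_normal((1, 8))   # second-layer weights
x = rng.standard_normal(4)         # one input

relu = lambda z: np.maximum(z, 0.0)
net = lambda A, B, v: (B @ relu(A @ v)).item()

c = 3.0
# Per-layer rescaling leaves the predictor (and hence the training loss) unchanged.
assert np.isclose(net(W1, W2, x), net(c * W1, W2 / c, x))

def euclidean_distance(params_a, params_b):
    """Plain Euclidean distance between the stacked weights of two networks."""
    return float(np.sqrt(sum(np.linalg.norm(a - b) ** 2
                             for a, b in zip(params_a, params_b))))

def layerwise_angle_distance(params_a, params_b):
    """Sum of angles between corresponding layers' flattened weights;
    invariant to rescaling any layer by a positive constant (assumed form,
    used here only to illustrate the idea of an angle-wise distance)."""
    total = 0.0
    for a, b in zip(params_a, params_b):
        ua = a.ravel() / np.linalg.norm(a)
        ub = b.ravel() / np.linalg.norm(b)
        total += np.arccos(np.clip(ua @ ub, -1.0, 1.0))
    return float(total)

original = (W1, W2)
rescaled = (c * W1, W2 / c)
print("Euclidean distance :", euclidean_distance(original, rescaled))       # large
print("Angle-wise distance:", layerwise_angle_distance(original, rescaled)) # ~ 0
```

Because the loss is unchanged under such rescalings, a bound stated in terms of a rescale-invariant, angle-wise distance is the natural object to track; the exact distance analyzed in the paper may differ from the simple layer-wise angle used in this sketch.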
Pages: 382-394
Number of pages: 13
Related papers
50 records in total
  • [1] Calibrated Stochastic Gradient Descent for Convolutional Neural Networks
    Zhuo, Li'an
    Zhang, Baochang
    Chen, Chen
    Ye, Qixiang
    Liu, Jianzhuang
    Doermann, David
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 9348 - 9355
  • [2] Convergence of gradient descent for learning linear neural networks
    Nguegnang, Gabin Maxime
    Rauhut, Holger
    Terstiege, Ulrich
    ADVANCES IN CONTINUOUS AND DISCRETE MODELS, 2024, 2024 (01)
  • [3] Is Learning in Biological Neural Networks Based on Stochastic Gradient Descent? An Analysis Using Stochastic Processes
    Christensen, Soeren
    Kallsen, Jan
    NEURAL COMPUTATION, 2024, 36 (07) : 1424 - 1432
  • [4] Learning curves for stochastic gradient descent in linear feedforward networks
    Werfel, J
    Xie, XH
    Seung, HS
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 1197 - 1204
  • [5] Learning curves for stochastic gradient descent in linear feedforward networks
    Werfel, J
    Xie, XH
    Seung, HS
    NEURAL COMPUTATION, 2005, 17 (12) : 2699 - 2718
  • [6] Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
    Cui, Xiaodong
    Zhang, Wei
    Tuske, Zoltan
    Picheny, Michael
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [7] A GEOMETRIC APPROACH OF GRADIENT DESCENT ALGORITHMS IN LINEAR NEURAL NETWORKS
    Chitour, Yacine
    Liao, Zhenyu
    Couillet, Romain
    MATHEMATICAL CONTROL AND RELATED FIELDS, 2023, 13 (03) : 918 - 945
  • [8] Optimizing Deep Neural Networks Through Neuroevolution With Stochastic Gradient Descent
    Zhang, Haichao
    Hao, Kuangrong
    Gao, Lei
    Wei, Bing
    Tang, Xuesong
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2023, 15 (01) : 111 - 121
  • [9] Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
    Cao, Yuan
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [10] Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent
    Whiting, Wes
    Wang, Bao
    Xin, Jack
    COMMUNICATIONS ON APPLIED MATHEMATICS AND COMPUTATION, 2024, 6 (02) : 1175 - 1188