Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers

Cited: 6
Authors
Paquin, Alexandre Lemire [1 ]
Chaib-draa, Brahim [1 ]
Giguere, Philippe [1 ]
Affiliations
[1] Laval Univ, Dept Comp Sci & Software Engn, Pavillon Adrien-Pouliot, 1065 Ave Med, Quebec City, QC G1V 0A6, Canada
Keywords
Generalization; Deep learning; Stochastic gradient descent; Stability
DOI
10.1016/j.neunet.2023.04.028
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We prove new generalization bounds for stochastic gradient descent when training classifiers with invariances. Our analysis is based on the stability framework and covers both the convex case of linear classifiers and the non-convex case of homogeneous neural networks. We analyze stability with respect to the normalized version of the loss function used for training. This leads to investigating a form of angle-wise stability instead of Euclidean stability in weights. For neural networks, the measure of distance we consider is invariant to rescaling the weights of each layer. Furthermore, we exploit the notion of on-average stability in order to obtain a data-dependent quantity in the bound. This data-dependent quantity is seen to be more favorable when training with larger learning rates in our numerical experiments, which might help shed some light on why larger learning rates can lead to better generalization in some practical scenarios. (c) 2023 Elsevier Ltd. All rights reserved.
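The abstract's contrast between Euclidean and angle-wise stability can be made concrete with a small sketch. The snippet below is an illustrative assumption, not the paper's actual definitions: it compares the raw Euclidean distance between the (flattened) weight vectors of two SGD runs with an angular distance that ignores positive rescaling. The intuition is that for a homogeneous predictor, rescaling the weights changes only the scale of the output (roughly, a normalized quantity like f(w, x) / ||w||^L is unaffected), so only the direction of w matters for the normalized loss.

```python
import numpy as np

def euclidean_distance(w, w_prime):
    """Standard Euclidean distance between two flattened weight vectors."""
    return np.linalg.norm(w - w_prime)

def angular_distance(w, w_prime):
    """Angle between two weight vectors. Invariant to rescaling either
    vector by any c > 0, unlike the Euclidean distance above."""
    cos = np.dot(w, w_prime) / (np.linalg.norm(w) * np.linalg.norm(w_prime))
    return np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards rounding error

# Hypothetical weights from two SGD runs on neighboring datasets.
w = np.array([1.0, 2.0, 2.0])
w_prime = 3.0 * w  # same direction, three times the norm

print(euclidean_distance(w, w_prime))  # 6.0 -- large Euclidean gap
print(angular_distance(w, w_prime))    # 0.0 -- identical direction
```

For a multi-layer network, one would apply such an angular comparison to each layer's weights separately, since the distance in the paper is stated to be invariant to rescaling the weights of each layer; the function names above are placeholders, not the authors' notation.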
Pages: 382-394
Number of pages: 13
Related papers
(50 records in total)
  • [31] Global Convergence and Stability of Stochastic Gradient Descent
    Patel, Vivak
    Zhang, Shushu
    Tian, Bowen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [32] Gradient Descent Analysis: On Visualizing the Training of Deep Neural Networks
    Becker, Martin
    Lippel, Jens
    Zielke, Thomas
    PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL 3: IVAPP, 2019: 338-345
  • [33] Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent
    Lei, Yunwen
    Ying, Yiming
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [34] Implicit Bias of (Stochastic) Gradient Descent for Rank-1 Linear Neural Network
    Lyu, Bochen
    Zhu, Zhanxing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [35] Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data
    Frei, Spencer
    Chatterji, Niladri S.
    Bartlett, Peter L.
    CONFERENCE ON LEARNING THEORY, VOL 178, 2022, 178
  • [36] Parameter calibration with stochastic gradient descent for interacting particle systems driven by neural networks
    Goettlich, Simone
    Totzeck, Claudia
    MATHEMATICS OF CONTROL SIGNALS AND SYSTEMS, 2022, 34 (01): 185-214
  • [37] Convergence of stochastic gradient descent under a local Lojasiewicz condition for deep neural networks
    An, Jing
    Lu, Jianfeng
    arXiv, 2023
  • [40] Stochastic Gradient Descent and Anomaly of Variance-Flatness Relation in Artificial Neural Networks
    Xiong, Xia
    Chen, Yong-Cong
    Shi, Chunxiao
    Ao, Ping
    CHINESE PHYSICS LETTERS, 2023, 40 (08)