Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers

Cited by: 6
Authors
Paquin, Alexandre Lemire [1 ]
Chaib-draa, Brahim [1 ]
Giguere, Philippe [1 ]
Affiliations
[1] Laval Univ, Dept Comp Sci & Software Engn, Pavillon Adrien Pouliot 1065, Ave Med, Quebec City, PQ G1V 0A6, Canada
Keywords
Generalization; Deep learning; Stochastic gradient descent; Stability
DOI
10.1016/j.neunet.2023.04.028
CLC classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We prove new generalization bounds for stochastic gradient descent when training classifiers with invariances. Our analysis is based on the stability framework and covers both the convex case of linear classifiers and the non-convex case of homogeneous neural networks. We analyze stability with respect to the normalized version of the loss function used for training, which leads to investigating a form of angle-wise stability rather than Euclidean stability in the weights. For neural networks, the distance measure we consider is invariant to rescaling the weights of each layer. Furthermore, we exploit the notion of on-average stability to obtain a data-dependent quantity in the bound. In our numerical experiments, this data-dependent quantity is more favorable when training with larger learning rates, which may help shed light on why larger learning rates can lead to better generalization in some practical scenarios. © 2023 Elsevier Ltd. All rights reserved.
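The abstract's two central notions, a normalized loss and an angle-wise distance that is invariant to rescaling the weights, can be illustrated concretely. Below is a minimal Python sketch under assumed definitions (logistic loss on the normalized margin of a linear classifier, and per-layer angles for a homogeneous network); the paper's exact normalization and stability distance may differ, so treat this as illustration rather than the authors' construction.

```python
import numpy as np

# Hedged sketch of the two invariant quantities suggested by the abstract.
# The specific choices (logistic loss, sum of per-layer angles) are
# assumptions for illustration, not the paper's exact definitions.

def normalized_logistic_loss(w, x, y):
    """Logistic loss on the normalized margin y * <w/||w||, x>.
    For a linear classifier this value is unchanged when w is
    multiplied by any positive scalar."""
    margin = y * np.dot(w, x) / np.linalg.norm(w)
    return np.logaddexp(0.0, -margin)  # log(1 + exp(-margin)), numerically stable

def angle(u, v):
    """Angle between two weight vectors: an 'angle-wise' distance,
    invariant to positive rescaling of either argument."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def layerwise_angle(layers_a, layers_b):
    """For a homogeneous (e.g., ReLU) network, measure distance layer by
    layer, so rescaling any single layer's weights leaves it unchanged."""
    return sum(angle(a.ravel(), b.ravel()) for a, b in zip(layers_a, layers_b))

# Scale invariance in action: rescaling w does not change the loss.
rng = np.random.default_rng(0)
w, x = rng.normal(size=5), rng.normal(size=5)
print(normalized_logistic_loss(w, x, y=1.0))
print(normalized_logistic_loss(3.0 * w, x, y=1.0))  # identical value
```

Rescaling the weight vector (or, for the layerwise variant, any single layer) leaves both quantities unchanged, which is the kind of invariance the abstract says the stability analysis exploits.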
Pages: 382-394
Number of pages: 13