Generalization Bounds for Label Noise Stochastic Gradient Descent

Cited: 0
Authors
Huh, Jung Eun [1]
Rebeschini, Patrick [1]
Affiliations
[1] Univ Oxford, Dept Stat, Oxford, England
Keywords
STABILITY
DOI
Not available
CLC Classification Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We develop generalization error bounds for stochastic gradient descent (SGD) with label noise in non-convex settings under uniform dissipativity and smoothness conditions. Under a suitable choice of semimetric, we establish a contraction in Wasserstein distance of the label noise stochastic gradient flow that depends polynomially on the parameter dimension d. Using the framework of algorithmic stability, we derive time-independent generalization error bounds for the discretized algorithm with a constant learning rate. The error bound we achieve scales polynomially with d and at the rate n^(-2/3), where n is the sample size. This rate is better than the best-known rate of n^(-1/2) established for stochastic gradient Langevin dynamics (SGLD), which employs parameter-independent Gaussian noise, under similar conditions. Our analysis offers quantitative insights into the effect of label noise.
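To make the abstract's contrast concrete, the sketch below runs label noise SGD, where Gaussian noise is injected into the sampled label and therefore enters through the gradient, side by side with SGLD, where parameter-independent isotropic Gaussian noise is added directly to the iterate. This is a minimal illustration only: the least-squares model, the constants eta and sigma, and all variable names are assumptions for this sketch, not the paper's exact setting.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares data (illustrative assumption, not the paper's setup).
n, d = 100, 5
X = rng.normal(size=(n, d))
theta_star = rng.normal(size=d)
y = X @ theta_star + 0.1 * rng.normal(size=n)

eta = 0.01    # constant learning rate, as in the discretized algorithm
sigma = 0.5   # label noise scale (assumed value)
T = 1000      # number of iterations

theta_ln = np.zeros(d)    # label noise SGD iterate
theta_sgld = np.zeros(d)  # SGLD iterate

for _ in range(T):
    i = rng.integers(n)  # sample one data point uniformly

    # Label noise SGD: perturb the sampled label, then take a plain SGD step.
    # The injected noise enters through the gradient, so its covariance
    # depends on the data (and, for general models, on the parameters).
    y_noisy = y[i] + sigma * rng.normal()
    theta_ln -= eta * (X[i] @ theta_ln - y_noisy) * X[i]

    # SGLD: step on the clean gradient, then add parameter-independent
    # isotropic Gaussian noise scaled by sqrt(2 * eta).
    theta_sgld -= eta * (X[i] @ theta_sgld - y[i]) * X[i]
    theta_sgld += np.sqrt(2 * eta) * rng.normal(size=d)

Note that in the label noise update the injected term sigma * rng.normal() multiplies the feature vector inside the gradient, whereas the SGLD perturbation is the same isotropic Gaussian regardless of the current iterate; this is the parameter-dependent versus parameter-independent distinction the abstract draws.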
Pages: 26
Related Papers (50 in total)
  • [1] Generalization Bounds for Stochastic Gradient Descent via Localized ε-Covers
    Park, Sejun
    Simsekli, Umut
    Erdogdu, Murat A.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [2] Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
    Cao, Yuan
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [3] Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent
    Wang, Jiahuan
    Chen, Hong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024: 15511 - 15519
  • [4] On the Generalization of Stochastic Gradient Descent with Momentum
    Ramezani-Kebrya, Ali
    Antonakopoulos, Kimon
    Cevher, Volkan
    Khisti, Ashish
    Liang, Ben
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 56
  • [5] Stability and Generalization of Decentralized Stochastic Gradient Descent
    Sun, Tao
    Li, Dongsheng
    Wang, Bao
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9756 - 9764
  • [6] Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
    Pillaud-Vivien, Loucas
    Reygner, Julien
    Flammarion, Nicolas
    CONFERENCE ON LEARNING THEORY, VOL 178, 2022, 178
  • [7] Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization
    Haghifam, Mahdi
    Rodriguez-Galvez, Borja
    Thobaben, Ragnar
    Skoglund, Mikael
    Roy, Daniel M.
    Dziugaite, Gintare Karolina
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 663 - 706
  • [8] The effective noise of stochastic gradient descent
    Mignacco, Francesca
    Urbani, Pierfrancesco
    JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2022, 2022 (08)
  • [9] Revisiting the Noise Model of Stochastic Gradient Descent
    Battash, Barak
    Wolf, Lior
    Lindenbaum, Ofir
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [10] Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm
    Zhu, Miaoxi
    Shen, Li
    Du, Bo
    Tao, Dacheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023