Generalization Bounds for Label Noise Stochastic Gradient Descent

Cited: 0
|
Authors
Huh, Jung Eun [1]
Rebeschini, Patrick [1]
Affiliation
[1] Univ Oxford, Dept Stat, Oxford, England
Source
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238 | 2024 / Vol. 238
Keywords
STABILITY;
DOI
Not available
CLC Classification Code
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We develop generalization error bounds for stochastic gradient descent (SGD) with label noise in non-convex settings under uniform dissipativity and smoothness conditions. Under a suitable choice of semimetric, we establish a contraction in Wasserstein distance of the label noise stochastic gradient flow that depends polynomially on the parameter dimension d. Using the framework of algorithmic stability, we derive time-independent generalization error bounds for the discretized algorithm with a constant learning rate. The error bound we achieve scales polynomially with d and at the rate n^(-2/3), where n is the sample size. This rate is better than the best-known rate of n^(-1/2) established for stochastic gradient Langevin dynamics (SGLD), which employs parameter-independent Gaussian noise, under similar conditions. Our analysis offers quantitative insights into the effect of label noise.
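The abstract contrasts label noise SGD, where the injected randomness enters through perturbed labels and hence depends on the current parameters via the gradient, with SGLD, which adds parameter-independent isotropic Gaussian noise. The following minimal sketch (not the paper's construction; problem, step sizes, and noise levels are illustrative) shows the two update rules side by side on a least-squares regression:

```python
# Illustrative contrast between label-noise SGD and SGLD on least squares.
# All names and hyperparameters here are assumptions for the sketch only.
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def label_noise_sgd_step(w, lr=0.01, sigma=0.1):
    # Perturb the sampled label each step: the noise reaches the iterate
    # through the gradient, so its effect depends on the current parameters.
    i = rng.integers(n)
    y_noisy = y[i] + sigma * rng.normal()
    grad = (X[i] @ w - y_noisy) * X[i]
    return w - lr * grad

def sgld_step(w, lr=0.01, beta=1e4):
    # SGLD: stochastic gradient plus parameter-independent Gaussian noise
    # added directly to the iterate (inverse temperature beta).
    i = rng.integers(n)
    grad = (X[i] @ w - y[i]) * X[i]
    return w - lr * grad + np.sqrt(2.0 * lr / beta) * rng.normal(size=d)

w1 = np.zeros(d)
w2 = np.zeros(d)
for _ in range(5000):
    w1 = label_noise_sgd_step(w1)
    w2 = sgld_step(w2)
```

Both iterates hover near the least-squares solution; the distinction the paper exploits is in the law of the injected noise, not in where the iterates converge.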
Pages: 26