Generalization Bounds for Label Noise Stochastic Gradient Descent

Times Cited: 0
Authors
Huh, Jung Eun [1 ]
Rebeschini, Patrick [1 ]
Affiliations
[1] Univ Oxford, Dept Stat, Oxford, England
Source
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, 2024, Vol. 238
Keywords
STABILITY
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We develop generalization error bounds for stochastic gradient descent (SGD) with label noise in non-convex settings under uniform dissipativity and smoothness conditions. Under a suitable choice of semimetric, we establish a contraction in Wasserstein distance of the label noise stochastic gradient flow that depends polynomially on the parameter dimension d. Using the framework of algorithmic stability, we derive time-independent generalization error bounds for the discretized algorithm with a constant learning rate. The error bound we achieve scales polynomially with d and at the rate n^(-2/3), where n is the sample size. This rate improves on the best-known rate of n^(-1/2) established for stochastic gradient Langevin dynamics (SGLD), which injects parameter-independent Gaussian noise, under similar conditions. Our analysis offers quantitative insight into the effect of label noise.
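To make the algorithm under study concrete, the following is a minimal sketch of SGD with label noise, here instantiated for linear least squares purely for illustration (the paper's analysis covers non-convex dissipative losses, and the function name, step size, and noise level below are illustrative assumptions, not the authors' setup). At each step, the sampled label is perturbed by fresh Gaussian noise before the gradient is computed, so the injected gradient noise depends on the current parameter, unlike the isotropic Gaussian noise added by SGLD.

```python
import numpy as np

def label_noise_sgd(X, y, steps=10_000, lr=1e-2, sigma=0.1, seed=0):
    """Sketch of SGD with label noise on linear least squares.

    Each step samples one data point, perturbs its label with fresh
    Gaussian noise, and takes a gradient step on the squared loss of
    the noisy label. The resulting update noise is parameter-dependent,
    in contrast to SGLD's parameter-independent Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)                     # minibatch of size 1
        y_noisy = y[i] + sigma * rng.normal()   # fresh label noise each step
        grad = (X[i] @ theta - y_noisy) * X[i]  # grad of 0.5*(x.theta - y)^2
        theta -= lr * grad                      # constant learning rate
    return theta

# Toy usage: recover a planted linear model from noisy observations.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
theta_star = rng.normal(size=5)
y = X @ theta_star + 0.05 * rng.normal(size=200)
print(label_noise_sgd(X, y))
```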
Pages: 26