Generalization Bounds for Label Noise Stochastic Gradient Descent

Times Cited: 0
Authors
Huh, Jung Eun [1 ]
Rebeschini, Patrick [1 ]
Affiliations
[1] Univ Oxford, Dept Stat, Oxford, England
Source
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, 2024, Vol. 238
Keywords
STABILITY
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We develop generalization error bounds for stochastic gradient descent (SGD) with label noise in non-convex settings under uniform dissipativity and smoothness conditions. Under a suitable choice of semimetric, we establish a contraction in Wasserstein distance of the label noise stochastic gradient flow that depends polynomially on the parameter dimension d. Using the framework of algorithmic stability, we derive time-independent generalization error bounds for the discretized algorithm with a constant learning rate. The error bound we achieve scales polynomially with d and at the rate n^(-2/3), where n is the sample size. This rate improves on the best-known rate of n^(-1/2) established for stochastic gradient Langevin dynamics (SGLD), which injects parameter-independent Gaussian noise, under similar conditions. Our analysis offers quantitative insight into the effect of label noise.
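To make the algorithm under study concrete, the following is a minimal sketch of SGD with label noise, here instantiated for linear least squares purely for illustration (the paper's analysis covers non-convex dissipative losses, and the function name, step size, and noise level below are illustrative assumptions, not the authors' setup). At each step, the sampled label is perturbed by fresh Gaussian noise before the gradient is computed, so the injected gradient noise depends on the current parameter, unlike the isotropic Gaussian noise added by SGLD.

```python
import numpy as np

def label_noise_sgd(X, y, steps=10_000, lr=1e-2, sigma=0.1, seed=0):
    """Sketch of SGD with label noise on linear least squares.

    Each step samples one data point, perturbs its label with fresh
    Gaussian noise, and takes a gradient step on the squared loss of
    the noisy label. The resulting update noise is parameter-dependent,
    in contrast to SGLD's parameter-independent Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)                     # minibatch of size 1
        y_noisy = y[i] + sigma * rng.normal()   # fresh label noise each step
        grad = (X[i] @ theta - y_noisy) * X[i]  # grad of 0.5*(x.theta - y)^2
        theta -= lr * grad                      # constant learning rate
    return theta

# Toy usage: recover a planted linear model from noisy observations.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
theta_star = rng.normal(size=5)
y = X @ theta_star + 0.05 * rng.normal(size=200)
print(label_noise_sgd(X, y))
```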
Pages: 26