Generalization Bounds for Label Noise Stochastic Gradient Descent

Cited: 0
Authors
Huh, Jung Eun [1]
Rebeschini, Patrick [1]
Affiliations
[1] Univ Oxford, Dept Stat, Oxford, England
Keywords
STABILITY
DOI
Not available
CLC Classification Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We develop generalization error bounds for stochastic gradient descent (SGD) with label noise in non-convex settings under uniform dissipativity and smoothness conditions. Under a suitable choice of semimetric, we establish a contraction in Wasserstein distance of the label noise stochastic gradient flow that depends polynomially on the parameter dimension d. Using the framework of algorithmic stability, we derive time-independent generalization error bounds for the discretized algorithm with a constant learning rate. The error bound we achieve scales polynomially with d and at the rate n^(-2/3), where n is the sample size. This rate is better than the best-known rate of n^(-1/2) established for stochastic gradient Langevin dynamics (SGLD), which employs parameter-independent Gaussian noise, under similar conditions. Our analysis offers quantitative insights into the effect of label noise.
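To make the abstract's contrast concrete, the sketch below runs label noise SGD, where Gaussian noise is injected into the sampled label and therefore enters through the gradient, side by side with SGLD, where parameter-independent isotropic Gaussian noise is added directly to the iterate. This is a minimal illustration only: the least-squares model, the constants eta and sigma, and all variable names are assumptions for this sketch, not the paper's exact setting.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares data (illustrative assumption, not the paper's setup).
n, d = 100, 5
X = rng.normal(size=(n, d))
theta_star = rng.normal(size=d)
y = X @ theta_star + 0.1 * rng.normal(size=n)

eta = 0.01    # constant learning rate, as in the discretized algorithm
sigma = 0.5   # label noise scale (assumed value)
T = 1000      # number of iterations

theta_ln = np.zeros(d)    # label noise SGD iterate
theta_sgld = np.zeros(d)  # SGLD iterate

for _ in range(T):
    i = rng.integers(n)  # sample one data point uniformly

    # Label noise SGD: perturb the sampled label, then take a plain SGD step.
    # The injected noise enters through the gradient, so its covariance
    # depends on the data (and, for general models, on the parameters).
    y_noisy = y[i] + sigma * rng.normal()
    theta_ln -= eta * (X[i] @ theta_ln - y_noisy) * X[i]

    # SGLD: step on the clean gradient, then add parameter-independent
    # isotropic Gaussian noise scaled by sqrt(2 * eta).
    theta_sgld -= eta * (X[i] @ theta_sgld - y[i]) * X[i]
    theta_sgld += np.sqrt(2 * eta) * rng.normal(size=d)

Note that in the label noise update the injected term sigma * rng.normal() multiplies the feature vector inside the gradient, whereas the SGLD perturbation is the same isotropic Gaussian regardless of the current iterate; this is the parameter-dependent versus parameter-independent distinction the abstract draws.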
Pages: 26
Related Papers (50 in total)
  • [1] Generalization Bounds for Stochastic Gradient Descent via Localized ε-Covers
    Park, Sejun
    Simsekli, Umut
    Erdogdu, Murat A.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [2] Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
    Cao, Yuan
    Gu, Quanquan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [3] Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent
    Wang, Jiahuan
    Chen, Hong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024: 15511 - 15519
  • [4] On the Generalization of Stochastic Gradient Descent with Momentum
    Ramezani-Kebrya, Ali
    Antonakopoulos, Kimon
    Cevher, Volkan
    Khisti, Ashish
    Liang, Ben
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 56
  • [5] Stability and Generalization of Decentralized Stochastic Gradient Descent
    Sun, Tao
    Li, Dongsheng
    Wang, Bao
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9756 - 9764
  • [6] Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
    Pillaud-Vivien, Loucas
    Reygner, Julien
    Flammarion, Nicolas
    CONFERENCE ON LEARNING THEORY, VOL 178, 2022, 178
  • [7] Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization
    Haghifam, Mahdi
    Rodriguez-Galvez, Borja
    Thobaben, Ragnar
    Skoglund, Mikael
    Roy, Daniel M.
    Dziugaite, Gintare Karolina
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 663 - 706
  • [8] The effective noise of stochastic gradient descent
    Mignacco, Francesca
    Urbani, Pierfrancesco
    JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2022, 2022 (08)
  • [9] Revisiting the Noise Model of Stochastic Gradient Descent
    Battash, Barak
    Wolf, Lior
    Lindenbaum, Ofir
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [10] Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm
    Zhu, Miaoxi
    Shen, Li
    Du, Bo
    Tao, Dacheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023