Unified Analysis of Stochastic Gradient Methods for Composite Convex and Smooth Optimization

Cited by: 0
Authors
Ahmed Khaled
Othmane Sebbouh
Nicolas Loizou
Robert M. Gower
Peter Richtárik
Affiliations
[1] Princeton University
[2] ENS Paris
[3] CREST-ENSAE
[4] Johns Hopkins University
[5] Flatiron Institute
[6] KAUST
Keywords
Stochastic optimization; Convex optimization; Variance reduction; Composite optimization
DOI: Not available
Abstract
We present a unified theorem for the convergence analysis of stochastic gradient algorithms for minimizing a smooth and convex loss plus a convex regularizer. We do this by extending the unified analysis of Gorbunov et al. (in: AISTATS, 2020) and dropping the requirement that the loss function be strongly convex; instead, we rely only on convexity of the loss function. Our unified analysis applies to a host of existing algorithms, such as proximal SGD, variance-reduced methods, quantization and some coordinate descent-type methods. For the variance-reduced methods, we recover the best known convergence rates as special cases. For proximal SGD and the quantization and coordinate descent-type methods, we uncover new state-of-the-art convergence rates. Our analysis also covers any form of sampling or minibatching. As such, we are able to determine the minibatch size that optimizes the total complexity of variance-reduced methods. We showcase this by obtaining a simple formula for the optimal minibatch size of two variance-reduced methods (L-SVRG and SAGA). This optimal minibatch size not only improves the theoretical total complexity of the methods but also improves their convergence in practice, as we show in several experiments.
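To make the composite setting concrete, below is a minimal sketch of proximal SGD for the problem min_x (1/n) sum_i f_i(x) + R(x), one of the methods covered by the unified analysis. The least-squares losses, L1 regularizer, step size, and minibatch size are illustrative assumptions for this sketch only, not the paper's setup or its optimal choices, and the code is not the authors' implementation.

```python
# Minimal proximal SGD sketch (illustrative only):
# minimize (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2 + lam * ||x||_1,
# using a minibatch stochastic gradient of the smooth part followed by the
# proximal operator of the L1 regularizer (soft-thresholding).
import numpy as np

def prox_l1(x, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def proximal_sgd(A, b, lam=0.1, step=0.01, batch=8, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        idx = rng.choice(n, size=batch, replace=False)  # minibatch sampling
        g = A[idx].T @ (A[idx] @ x - b[idx]) / batch    # stochastic gradient of the smooth loss
        x = prox_l1(x - step * g, step * lam)           # forward-backward (proximal) step
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 20))
    x_true = np.zeros(20)
    x_true[:3] = 1.0
    b = A @ x_true + 0.01 * rng.standard_normal(200)
    print(proximal_sgd(A, b)[:5])  # approximately recovers the sparse x_true
```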
Pages: 499–540 (41 pages)
Related papers (50 in total; entries [21]–[30] shown)
  • [21] Efficiency of Stochastic Coordinate Proximal Gradient Methods on Nonseparable Composite Optimization
    Necoara, Ion
    Chorobura, Flavia
    MATHEMATICS OF OPERATIONS RESEARCH, 2024,
  • [22] Relatively accelerated stochastic gradient algorithm for a class of non-smooth convex optimization problem
    Zhang, Wenjuan
    Feng, Xiangchu
    Xiao, Feng
    Huang, Shujuan
    Li, Huan
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2024, 51 (03): 147 - 157
  • [23] Inexact Proximal Gradient Methods for Non-Convex and Non-Smooth Optimization
    Gu, Bin
    Wang, De
    Huo, Zhouyuan
    Huang, Heng
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 3093 - 3100
  • [24] Distributed stochastic gradient tracking methods with momentum acceleration for non-convex optimization
    Gao, Juan
    Liu, Xin-Wei
    Dai, Yu-Hong
    Huang, Yakui
    Gu, Junhua
    COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 2023, 84 (02) : 531 - 572
  • [26] Additive Schwarz Methods for Convex Optimization as Gradient Methods
    Park, Jongho
    SIAM JOURNAL ON NUMERICAL ANALYSIS, 2020, 58 (03) : 1495 - 1530
  • [27] Convergence Rate Analysis of Distributed Gradient Methods for Smooth Optimization
    Jakovetic, Dusan
    Xavier, Joao
    Moura, Jose M. F.
    2012 20TH TELECOMMUNICATIONS FORUM (TELFOR), 2012, : 867 - 870
  • [28] Optimal Tensor Methods in Smooth Convex and Uniformly Convex Optimization
    Gasnikov, Alexander
    Dvurechensky, Pavel
    Gorbunov, Eduard
    Vorontsova, Evgeniya
    Selikhanovych, Daniil
    Uribe, Cesar A.
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [29] A unified analysis of stochastic gradient-free Frank-Wolfe methods
    Guo, Jiahong
    Liu, Huiling
    Xiao, Xiantao
    INTERNATIONAL TRANSACTIONS IN OPERATIONAL RESEARCH, 2022, 29 (01) : 63 - 86
  • [30] Contracting Proximal Methods for Smooth Convex Optimization
    Doikov, Nikita
    Nesterov, Yurii
    SIAM JOURNAL ON OPTIMIZATION, 2020, 30 (04) : 3146 - 3169