Accelerated Bregman proximal gradient methods for relatively smooth convex optimization

Cited by: 19
Authors
Hanzely, Filip [1 ,2 ]
Richtarik, Peter [1 ,3 ]
Xiao, Lin [4 ]
Affiliations
[1] King Abdullah Univ Sci & Technol KAUST, Div Comp Elect & Math Sci & Engn CEMSE, Thuwal, Saudi Arabia
[2] Toyota Technol Inst Chicago TTIC, Chicago, IL USA
[3] Moscow Inst Phys & Technol, Dolgoprudnyi, Russia
[4] Microsoft Res, Redmond, WA 98052 USA
Keywords
Convex optimization; Relative smoothness; Bregman divergence; Proximal gradient methods; Accelerated gradient methods; First-order methods; Minimization algorithm; Designs
DOI
10.1007/s10589-021-00273-8
Chinese Library Classification
C93 [Management Science]; O22 [Operations Research]
Discipline codes
070105; 12; 1201; 1202; 120202
Abstract
We consider the problem of minimizing the sum of two convex functions: one is differentiable and relatively smooth with respect to a reference convex function, and the other can be nondifferentiable but simple to optimize. We investigate a triangle scaling property of the Bregman distance generated by the reference convex function and present accelerated Bregman proximal gradient (ABPG) methods that attain an O(k^{-γ}) convergence rate, where γ ∈ (0, 2] is the triangle scaling exponent (TSE) of the Bregman distance. For the Euclidean distance we have γ = 2 and recover the convergence rate of Nesterov's accelerated gradient methods. For non-Euclidean Bregman distances the TSE can be much smaller (say, γ ≤ 1), but we show that a relaxed definition of intrinsic TSE is always equal to 2. We exploit the intrinsic TSE to develop adaptive ABPG methods that converge much faster in practice. Although theoretical guarantees on a fast convergence rate seem to be out of reach in general, our methods obtain empirical O(k^{-2}) rates in numerical experiments on several applications and provide posterior numerical certificates for the fast rates.
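The Euclidean case mentioned in the abstract (γ = 2) can be sketched concretely. The following is a minimal illustration, not the authors' implementation: with reference function h(x) = ½‖x‖², the Bregman divergence D_h is the squared Euclidean distance, the TSE is γ = 2, and the accelerated scheme reduces to a Nesterov-style method. The function name, step schedule, and toy problem are illustrative assumptions.

```python
import numpy as np

def abpg_euclidean(grad_f, L, x0, iters):
    """Sketch of an accelerated (Bregman) proximal gradient scheme with
    reference function h(x) = 0.5*||x||^2, so D_h is the squared
    Euclidean distance and the triangle scaling exponent is gamma = 2."""
    x = x0.copy()
    z = x0.copy()
    theta = 1.0
    for _ in range(iters):
        y = (1.0 - theta) * x + theta * z   # extrapolated point
        z = z - grad_f(y) / (theta * L)     # mirror (Bregman) step on z
        x = (1.0 - theta) * x + theta * z   # averaged primal iterate
        # next theta solves (1 - t) / t^2 = 1 / theta^2 (gamma = 2 case)
        theta = 0.5 * (np.sqrt(theta**4 + 4.0 * theta**2) - theta**2)
    return x

# Toy problem: f(x) = 0.5*||x - c||^2 is 1-smooth, minimized at x = c.
c = np.array([1.0, -2.0, 3.0])
x_star = abpg_euclidean(lambda x: x - c, L=1.0, x0=np.zeros(3), iters=50)
print(np.allclose(x_star, c))  # True: iterates reach the minimizer
```

With a non-Euclidean reference function (e.g. negative entropy), the z-update would instead solve a Bregman subproblem, and the θ schedule would depend on the (possibly smaller) TSE γ.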
Pages: 405-440 (36 pages)
Related papers (50 total)
  • [41] Inertial Block Proximal Methods For Non-Convex Non-Smooth Optimization
    Le Thi Khanh Hien
    Gillis, Nicolas
    Patrinos, Panagiotis
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [42] ACCELERATED FIRST-ORDER METHODS FOR CONVEX OPTIMIZATION WITH LOCALLY LIPSCHITZ CONTINUOUS GRADIENT
    Lu, Zhaosong
    Mei, Sanyou
    SIAM JOURNAL ON OPTIMIZATION, 2023, 33 (03) : 2275 - 2310
  • [43] Accelerated Distributed Nesterov Gradient Descent for Convex and Smooth Functions
    Qu, Guannan
    Li, Na
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,
  • [44] Proximal Point Methods for Quasiconvex and Convex Functions with Bregman Distances on Hadamard Manifolds
    Quiroz, E. A. Papa
    Oliveira, P. Roberto
    JOURNAL OF CONVEX ANALYSIS, 2009, 16 (01) : 49 - 69
  • [45] Accelerated Gradient-Free Optimization Methods with a Non-Euclidean Proximal Operator
    Vorontsova, E. A.
    Gasnikov, A. V.
    Gorbunov, E. A.
    Dvurechenskii, P. E.
    AUTOMATION AND REMOTE CONTROL, 2019, 80 (08) : 1487 - 1501
  • [46] Smoothing Accelerated Proximal Gradient Method with Fast Convergence Rate for Nonsmooth Convex Optimization Beyond Differentiability
    Wu, Fan
    Bian, Wei
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2023, 197 (02) : 539 - 572
  • [49] Momentum-based variance-reduced stochastic Bregman proximal gradient methods for nonconvex nonsmooth optimization
    Liao, Shichen
    Liu, Yan
    Han, Congying
    Guo, Tiande
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 266
  • [50] Nesterov Accelerated Shuffling Gradient Method for Convex Optimization
    Tran, Trang H.
    Scheinberg, Katya
    Nguyen, Lam M.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,