Improved Variance Reduction Methods for Riemannian Non-Convex Optimization

Cited by: 7
Authors
Han, Andi [1 ]
Gao, Junbin [1 ]
Affiliations
[1] Univ Sydney, Business Sch, Discipline Business Analyt, Sydney, NSW 2006, Australia
Funding
Australian Research Council;
Keywords
Complexity theory; Optimization; Manifolds; Convergence; Convex functions; Training; Principal component analysis; Riemannian optimization; non-convex optimization; online optimization; variance reduction; batch size adaptation;
DOI
10.1109/TPAMI.2021.3112139
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Variance reduction is a popular technique for accelerating gradient descent and stochastic gradient descent on optimization problems defined over both Euclidean spaces and Riemannian manifolds. This paper further improves existing variance reduction methods for non-convex Riemannian optimization, including R-SVRG and R-SRG/R-SPIDER, by providing a unified framework for batch size adaptation. The framework is more general than existing work in that it accommodates retraction, vector transport, and mini-batch stochastic gradients. We show that the adaptive-batch variance reduction methods require lower gradient complexities for both general non-convex and gradient-dominated functions, under both finite-sum and online optimization settings. Moreover, under the new framework we complete the analysis of R-SVRG and R-SRG, which is currently missing from the literature. We prove convergence of R-SVRG with a much simpler analysis, which leads to curvature-free complexity bounds. We also show improved results for R-SRG under double-loop convergence, which match the optimal complexities of R-SPIDER. In addition, we prove the first online complexity results for R-SVRG and R-SRG. Lastly, we discuss the potential of adapting batch size for non-smooth, constrained, and second-order Riemannian optimizers. Extensive experiments on a variety of applications support the analysis and claims in the paper.
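To make the setting concrete, below is a minimal Python sketch of one outer loop of an R-SVRG-style update on the unit sphere, combining the three ingredients the abstract names: a retraction, a vector transport, and mini-batch variance-reduced gradients, plus an illustrative batch-size adaptation rule. All names here (retract, transport, grad_fn, rsvrg_epoch) and the specific adaptation rule are assumptions for illustration, not the paper's implementation or its proven schedule.

import numpy as np

def retract(x, v):
    # Retraction on the unit sphere: step in the tangent direction, renormalize.
    y = x + v
    return y / np.linalg.norm(y)

def project(x, v):
    # Project an ambient vector onto the tangent space at x (sphere).
    return v - np.dot(x, v) * x

def transport(x, y, v):
    # Vector transport by projection: move a tangent vector from x to y.
    return project(y, v)

def rsvrg_epoch(x, n, grad_fn, step=0.1, inner_iters=20, batch=4, seed=None):
    # One outer loop of Riemannian SVRG with mini-batch stochastic gradients.
    # grad_fn(x, idx) returns the averaged Riemannian gradient of the
    # component functions indexed by idx, as a tangent vector at x.
    rng = np.random.default_rng(seed)
    x_ref = x.copy()
    full_grad = grad_fn(x_ref, np.arange(n))  # full gradient at reference point
    for _ in range(inner_iters):
        idx = rng.choice(n, size=batch, replace=False)
        # Variance-reduced direction: stochastic gradient at x, corrected by
        # the transported correction evaluated at the reference point.
        g = grad_fn(x, idx) - transport(x_ref, x, grad_fn(x_ref, idx) - full_grad)
        x = retract(x, -step * g)
        # Illustrative batch-size adaptation (an assumption, not the paper's
        # rule): grow the mini-batch as the update direction shrinks.
        batch = min(n, max(batch, int(1.0 / max(np.linalg.norm(g), 1e-8))))
    return x

if __name__ == "__main__":
    # Toy driver: leading eigenvector of a sample covariance, i.e., the PCA
    # instance mentioned in the keywords, with f_i(x) = -(a_i^T x)^2 / 2.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(100, 5))
    def grad_fn(x, idx):
        euclid = -(A[idx] @ x)[:, None] * A[idx]   # Euclidean component gradients
        return project(x, euclid.mean(axis=0))     # Riemannian gradient at x
    x0 = rng.normal(size=5)
    x0 /= np.linalg.norm(x0)
    x = rsvrg_epoch(x0, len(A), grad_fn, seed=1)
    print("estimate stays on the sphere, ||x|| =", np.linalg.norm(x))

The normalization retraction and projection-based transport are standard choices on the sphere; swapping in another manifold only requires replacing those two maps and the Riemannian gradient, which is the generality the abstract attributes to the retraction/vector-transport framework.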
Pages: 7610-7623
Page count: 14