Faster First-Order Methods for Stochastic Non-Convex Optimization on Riemannian Manifolds

Cited by: 15
Authors
Zhou, Pan [1 ]
Yuan, Xiao-Tong [2 ]
Yan, Shuicheng [1 ]
Feng, Jiashi [1 ]
Affiliations
[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[2] Nanjing Univ Informat Sci & Technol, Sch Automat, Nanjing 210044, Peoples R China
Keywords
Optimization; Complexity theory; Manifolds; Convergence; Signal processing algorithms; Stochastic processes; Minimization; Riemannian optimization; stochastic variance-reduced algorithm; non-convex optimization; online learning; ILLUMINATION; COMPLETION
DOI
10.1109/TPAMI.2019.2933841
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
First-order non-convex Riemannian optimization algorithms have gained recent popularity in structured machine learning problems, including principal component analysis and low-rank matrix completion. The current paper presents an efficient Riemannian Stochastic Path Integrated Differential EstimatoR (R-SPIDER) algorithm to solve finite-sum and online Riemannian non-convex minimization problems. At the core of R-SPIDER is a recursive semi-stochastic gradient estimator that can accurately estimate the Riemannian gradient not only under exponential mapping and parallel transport, but also under general retraction and vector transport operations. Compared with prior Riemannian algorithms, this recursive gradient estimation mechanism endows R-SPIDER with lower first-order oracle complexity. Specifically, for finite-sum problems with n components, R-SPIDER is proved to converge to an ε-approximate stationary point within O(min(n + √n/ε², 1/ε³)) stochastic gradient evaluations, beating the best-known complexity O(n + 1/ε⁴); for online optimization, R-SPIDER is shown to converge with O(1/ε³) complexity, which is, to the best of our knowledge, the first non-asymptotic result for online Riemannian optimization. For the special case of gradient-dominated functions, we further develop a variant of R-SPIDER with an improved linear rate of convergence. Extensive experimental results demonstrate the advantage of the proposed algorithms over state-of-the-art Riemannian non-convex optimization methods.
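For intuition about the recursive semi-stochastic estimator the abstract describes, below is a minimal NumPy sketch of a SPIDER-style loop on the unit sphere, applied to a PCA-type finite-sum objective f(x) = -(1/2n) Σ_i (aᵢᵀx)². The manifold operations (projection retraction, projection-based vector transport), the epoch length q, step size eta, and batch size are illustrative assumptions, not the paper's exact construction or parameters.

```python
import numpy as np

def proj(x, g):
    """Project a Euclidean gradient onto the tangent space of the sphere at x."""
    return g - np.dot(x, g) * x

def retract(x, v):
    """Metric-projection retraction: step in the tangent space, then renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def transport(y, v):
    """Projection-based vector transport of v into the tangent space at y."""
    return v - np.dot(y, v) * y

def egrad(A_rows, x, idx):
    """Euclidean mini-batch gradient of f(x) = -(1/2m) * sum_i (a_i^T x)^2."""
    B = A_rows[idx]
    return -(B.T @ (B @ x)) / len(idx)

def r_spider_sketch(A_rows, x0, eta=0.05, q=20, batch=16, iters=400, seed=0):
    rng = np.random.default_rng(seed)
    n = len(A_rows)
    x_prev, x, v = None, x0 / np.linalg.norm(x0), None
    for t in range(iters):
        if t % q == 0:
            # Anchor step: full-gradient Riemannian estimate.
            v = proj(x, egrad(A_rows, x, np.arange(n)))
        else:
            # Recursive semi-stochastic correction: transport the previous
            # estimate and previous mini-batch gradient to the current point.
            idx = rng.choice(n, size=batch, replace=False)
            g_new = proj(x, egrad(A_rows, x, idx))
            g_old = transport(x, proj(x_prev, egrad(A_rows, x_prev, idx)))
            v = g_new - g_old + transport(x, v)
        x_prev, x = x, retract(x, -eta * v)
    return x

# Toy usage: recover the top principal direction of random data.
rng = np.random.default_rng(1)
A_rows = rng.standard_normal((500, 20)) @ np.diag(np.linspace(1.0, 3.0, 20))
x_star = r_spider_sketch(A_rows, rng.standard_normal(20))
```

The key point the sketch illustrates is that the running estimate v is never recomputed from scratch between anchor steps; it is transported to the new iterate and corrected with a small mini-batch difference, which is what drives the improved oracle complexity.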
Pages: 459-472
Page count: 14
Related Papers
50 records in total
  • [41] Stochastic proximal methods for non-smooth non-convex constrained sparse optimization
    Metel, Michael R.
    Takeda, Akiko
    Journal of Machine Learning Research, 2021, 22
  • [42] Stochastic proximal quasi-Newton methods for non-convex composite optimization
    Wang, Xiaoyu
    Wang, Xiao
    Yuan, Ya-xiang
    Optimization Methods & Software, 2019, 34 (05): 922-948
  • [43] Distributed stochastic gradient tracking methods with momentum acceleration for non-convex optimization
    Gao, Juan
    Liu, Xin-Wei
    Dai, Yu-Hong
    Huang, Yakui
    Gu, Junhua
    Computational Optimization and Applications, 2023, 84 (02): 531-572
  • [45] MASAGA: A Linearly-Convergent Stochastic First-Order Method for Optimization on Manifolds
    Babanezhad, Reza
    Laradji, Issam H.
    Shafaei, Alireza
    Schmidt, Mark
    Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2018), Pt II, 2019, 11052: 344-359
  • [46] Relaxation in non-convex optimal control problems described by first-order evolution equations
    Tolstonogov, AA
    Sbornik Mathematics, 1999, 190 (11-12): 1689-1714
  • [47] Natasha 2: Faster Non-Convex Optimization Than SGD
    Allen-Zhu, Zeyuan
    Advances in Neural Information Processing Systems 31 (NIPS 2018), 2018, 31
  • [48] Stochastic Network Optimization with Non-Convex Utilities and Costs
    Neely, Michael J.
    2010 Information Theory and Applications Workshop (ITA), 2010: 352-361
  • [49] Decentralized Gradient-Free Methods for Stochastic Non-smooth Non-convex Optimization
    Lin, Zhenwei
    Xia, Jingfan
    Deng, Qi
    Luo, Luo
    Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol 38 No 16, 2024: 17477-17486
  • [50] A Stochastic Approach to the Convex Optimization of Non-Convex Discrete Energy Systems
    Burger, Eric M.
    Moura, Scott J.
    Proceedings of the ASME 10th Annual Dynamic Systems and Control Conference, 2017, Vol 3