The computational complexity of finding stationary points in non-convex optimization

Authors
Hollender, Alexandros [1]
Zampetakis, Manolis [2]
Affiliations
[1] Univ Oxford, Oxford, England
[2] Yale Univ, New Haven, CT USA
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.1007/s10107-024-02139-3
Chinese Library Classification
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of non-convex but smooth objective functions $f$ over unrestricted $d$-dimensional domains is one of the most fundamental problems in classical non-convex optimization. Nevertheless, the computational and query complexity of this problem are still not well understood when the dimension $d$ of the problem is independent of the approximation error. In this paper, we show the following computational and query complexity results: The problem of finding approximate stationary points over unrestricted domains is $\textsf{PLS}$-complete. For $d = 2$, we provide a zero-order algorithm for finding $\varepsilon$-approximate stationary points that requires at most $O(1/\varepsilon)$ value queries to the objective function. We show that any algorithm needs at least $\Omega(1/\varepsilon)$ queries to the objective function and/or its gradient to find $\varepsilon$-approximate stationary points when $d = 2$. Combined with the above, this characterizes the query complexity of this problem to be $\Theta(1/\varepsilon)$. For $d = 2$, we provide a zero-order algorithm for finding $\varepsilon$-KKT points in constrained optimization problems that requires at most $O(1/\sqrt{\varepsilon})$ value queries to the objective function. This closes the gap between the works of Bubeck and Mikulincer and of Vavasis, and characterizes the query complexity of this problem to be $\Theta(1/\sqrt{\varepsilon})$. Combining our results with a recent result of Fearnley et al., we show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization, but the converse is impossible.
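As an illustration of the notion the abstract relies on, the following minimal sketch checks whether a point is an ε-approximate stationary point, i.e. whether the Euclidean norm of the gradient is at most ε. It uses plain gradient descent on a hypothetical two-dimensional objective; this is not one of the paper's algorithms, and the function names, the objective, the step size, and the starting point are all assumptions made for the example.

```python
import math

def grad_norm(grad, x):
    """Euclidean norm of the gradient of the objective at point x."""
    return math.sqrt(sum(g * g for g in grad(x)))

def is_eps_stationary(grad, x, eps):
    """True if x is an eps-approximate stationary point: ||grad f(x)|| <= eps."""
    return grad_norm(grad, x) <= eps

def gradient_descent(grad, x0, eps, step, max_iters=100_000):
    """Plain gradient descent (an illustrative baseline, not the paper's method).

    Stops as soon as the current iterate is eps-stationary.
    """
    x = list(x0)
    for _ in range(max_iters):
        if is_eps_stationary(grad, x, eps):
            return x
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    raise RuntimeError("no eps-stationary point found within max_iters")

# Hypothetical smooth non-convex objective on R^2 (an assumption for this sketch):
#   f(x, y) = x^4 / 4 - x^2 / 2 + y^2 / 2
def example_grad(p):
    x, y = p
    return [x**3 - x, y]

point = gradient_descent(example_grad, [0.5, 1.0], eps=1e-3, step=0.1)
```

Here each call to `example_grad` plays the role of one gradient query; the query-complexity results in the abstract count how many such queries (or value queries) any algorithm must make as a function of ε.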
Pages: 61