The computational complexity of finding stationary points in non-convex optimization

Cited: 0
Authors
Hollender, Alexandros [1 ]
Zampetakis, Manolis [2 ]
Affiliations
[1] Univ Oxford, Oxford, England
[2] Yale Univ, New Haven, CT USA
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.1007/s10107-024-02139-3
CLC number
TP31 [Computer software]
Discipline codes
081202; 0835
Abstract
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of non-convex but smooth objective functions f over unrestricted d-dimensional domains is one of the most fundamental problems in classical non-convex optimization. Nevertheless, the computational and query complexity of this problem are still not well understood when the dimension d of the problem is independent of the approximation error. In this paper, we show the following computational and query complexity results:
(1) The problem of finding approximate stationary points over unrestricted domains is PLS-complete.
(2) For d = 2, we provide a zero-order algorithm for finding ε-approximate stationary points that requires at most O(1/ε) value queries to the objective function.
(3) We show that any algorithm needs at least Ω(1/ε) queries to the objective function and/or its gradient to find ε-approximate stationary points when d = 2. Combined with the above, this characterizes the query complexity of this problem as Θ(1/ε).
(4) For d = 2, we provide a zero-order algorithm for finding ε-KKT points in constrained optimization problems that requires at most O(1/√ε) value queries to the objective function. This closes the gap between the works of Bubeck and Mikulincer and of Vavasis, and characterizes the query complexity of this problem as Θ(1/√ε).
(5) Combining our results with a recent result of Fearnley et al., we show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization, but the converse is impossible.
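To make the central object of the abstract concrete: an ε-approximate stationary point of a smooth f is a point x with ‖∇f(x)‖ ≤ ε, and a zero-order method may only query function values, not gradients. The following is a minimal illustrative sketch of that setting (it is not the paper's algorithm, and the test function, step size, and tolerances are hypothetical choices): gradient descent driven entirely by value queries, using central finite differences to estimate the gradient.

```python
import numpy as np

def approx_grad(f, x, h=1e-6):
    """Central finite-difference estimate of grad f(x), using only value queries."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def find_stationary_point(f, x0, eps=1e-3, step=0.1, max_iters=100_000):
    """Run gradient descent until the estimated gradient norm is at most eps,
    i.e., until an eps-approximate stationary point is found."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = approx_grad(f, x)
        if np.linalg.norm(g) <= eps:
            break
        x = x - step * g
    return x

# A smooth non-convex function on R^2 (hypothetical example).
f = lambda x: np.sin(x[0]) * np.cos(x[1]) + 0.1 * (x[0] ** 2 + x[1] ** 2)
x_star = find_stationary_point(f, [1.0, 1.0])
```

Each finite-difference gradient estimate costs 2d value queries, so this naive scheme makes far more queries than the O(1/ε) bound the paper proves for d = 2; it only illustrates the problem being solved, not the query-optimal algorithm.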
Pages: 61