The computational complexity of finding stationary points in non-convex optimization

Cited: 0
Authors
Hollender, Alexandros [1 ]
Zampetakis, Manolis [2 ]
Institutions
[1] Univ Oxford, Oxford, England
[2] Yale Univ, New Haven, CT, USA
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
DOI
10.1007/s10107-024-02139-3
Chinese Library Classification
TP31 [Computer software];
Discipline Codes
081202; 0835;
Abstract
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of non-convex but smooth objective functions f over unrestricted d-dimensional domains is one of the most fundamental problems in classical non-convex optimization. Nevertheless, the computational and query complexity of this problem are still not well understood when the dimension d of the problem is independent of the approximation error. In this paper, we show the following computational and query complexity results:
- The problem of finding approximate stationary points over unrestricted domains is PLS-complete.
- For d = 2, we provide a zero-order algorithm for finding ε-approximate stationary points that requires at most O(1/ε) value queries to the objective function.
- We show that any algorithm needs at least Ω(1/ε) queries to the objective function and/or its gradient to find ε-approximate stationary points when d = 2. Combined with the above, this characterizes the query complexity of this problem to be Θ(1/ε).
- For d = 2, we provide a zero-order algorithm for finding ε-KKT points in constrained optimization problems that requires at most O(1/√ε) value queries to the objective function. This closes the gap between the works of Bubeck and Mikulincer and of Vavasis, and characterizes the query complexity of this problem to be Θ(1/√ε).
- Combining our results with a recent result of Fearnley et al., we show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization, but the converse is impossible.
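The zero-order access model in the abstract means the algorithm may only query values of f, never its gradient. The sketch below illustrates that model on a simple smooth non-convex function: it estimates the gradient by central finite differences (four value queries per step) and descends until the estimated gradient norm drops below ε. This is not the paper's O(1/ε)-query algorithm; the test function, step size, and difference width h are arbitrary choices for illustration only.

```python
import math

def finite_diff_grad(f, x, y, h=1e-6):
    """Estimate the gradient of f at (x, y) via central differences.
    Uses only value queries to f (zero-order access): 4 queries per call."""
    gx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return gx, gy

def zero_order_descent(f, x0, y0, eps=1e-3, step=0.05, max_iters=10000):
    """Gradient descent driven entirely by value queries.
    Stops when the estimated gradient norm is at most eps, i.e., at an
    (approximately) eps-approximate stationary point."""
    x, y = x0, y0
    for _ in range(max_iters):
        gx, gy = finite_diff_grad(f, x, y)
        if math.hypot(gx, gy) <= eps:
            break
        x -= step * gx
        y -= step * gy
    return x, y

# A smooth non-convex objective with stationary points at (-1, 0), (0, 0), (1, 0).
f = lambda u, v: (u * u - 1) ** 2 + v * v
x, y = zero_order_descent(f, 2.0, 1.0)
```

From the start point (2, 1) the iteration settles near the stationary point (1, 0); in general such a descent only finds some stationary point, not a global minimum, which is exactly why the paper studies the complexity of stationarity rather than optimality.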
Pages: 61