STOCHASTIC CONVEX OPTIMIZATION WITH BANDIT FEEDBACK

Cited by: 39
Authors
Agarwal, Alekh [1]
Foster, Dean P. [2]
Hsu, Daniel [3]
Kakade, Sham M. [3]
Rakhlin, Alexander [2]
Affiliations
[1] Microsoft Res, New York, NY 10016 USA
[2] Univ Penn, Dept Stat, Philadelphia, PA 19104 USA
[3] Microsoft Res, Cambridge, MA 02142 USA
Funding
National Science Foundation (USA)
Keywords
derivative-free optimization; bandit optimization; ellipsoid method;
DOI
10.1137/110850827
Chinese Library Classification
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $X$ under a stochastic bandit (i.e., noisy zeroth-order) feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in X$. The quantity of interest is the regret of the algorithm, which is the sum of the function values at the algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs $\tilde{O}(\mathrm{poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.
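To make the feedback model and the regret concrete, here is a minimal sketch. It is not the paper's ellipsoid-based algorithm. The function names, the one-dimensional objective, the Gaussian noise, and the uniform query strategy are all assumptions made for illustration. Regret is computed as the sum over query points of f(x_t) minus the optimal value f(x*), which is one reading of the definition in the abstract.

```python
import random

def noisy_oracle(f, x, sigma, rng):
    """Bandit (noisy zeroth-order) feedback: f(x) plus zero-mean noise.

    Gaussian noise is an assumption for this sketch only.
    """
    return f(x) + rng.gauss(0.0, sigma)

def cumulative_regret(f, queries, f_star):
    """Sum over query points of f(x_t) - f(x*)."""
    return sum(f(x) - f_star for x in queries)

# Hypothetical example: f(x) = |x - 0.3| on X = [0, 1], minimized at x* = 0.3.
f = lambda x: abs(x - 0.3)
rng = random.Random(0)

# A naive strategy that ignores feedback and queries uniformly at random.
queries = [rng.uniform(0.0, 1.0) for _ in range(1000)]
observations = [noisy_oracle(f, x, 0.1, rng) for x in queries]

regret = cumulative_regret(f, queries, f_star=0.0)
print(f"regret after {len(queries)} queries: {regret:.2f}")
```

Because the naive strategy never uses its observations, its regret grows linearly in the number of queries T, whereas the abstract's guarantee scales as the square root of T up to logarithmic and dimension factors.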
Pages: 213-240
Page count: 28