STOCHASTIC CONVEX OPTIMIZATION WITH BANDIT FEEDBACK

Cited by: 39
Authors
Agarwal, Alekh [1]
Foster, Dean P. [2]
Hsu, Daniel [3]
Kakade, Sham M. [3]
Rakhlin, Alexander [2]
Affiliations
[1] Microsoft Res, New York, NY 10016 USA
[2] Univ Penn, Dept Stat, Philadelphia, PA 19104 USA
[3] Microsoft Res, Cambridge, MA 02142 USA
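Funding
National Science Foundation (USA)
Keywords
derivative-free optimization; bandit optimization; ellipsoid method
DOI
10.1137/110850827
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract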
This paper addresses the problem of minimizing a convex, Lipschitz function f over a convex, compact set X under a stochastic bandit (i.e., noisy zeroth-order) feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value f(x) at any query point x ∈ X. The quantity of interest is the regret of the algorithm, which is the sum of the function values at the algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs Õ(poly(d)√T) regret. Since any algorithm has regret at least Ω(√T) on this problem, our algorithm is optimal in terms of the scaling with T.
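To make the feedback model and the regret quantity in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes the regret over T queries is the sum of f(x_t) - f(x*) over t = 1..T, and it assumes zero-mean Gaussian noise; the abstract specifies neither the noise distribution nor any of the names used here (`noisy_oracle`, `cumulative_regret`, the toy function `f`), which are hypothetical.

```python
import random


def noisy_oracle(f, x, sigma=0.1, rng=random):
    """Noisy zeroth-order feedback: f(x) plus zero-mean noise.

    Gaussian noise with standard deviation sigma is an assumption
    made for illustration only.
    """
    return f(x) + rng.gauss(0.0, sigma)


def cumulative_regret(f, queries, f_opt):
    """Assumed regret definition: sum over queries of f(x_t) - f_opt."""
    return sum(f(x) - f_opt for x in queries)


# Toy 1-D instance (hypothetical): f(x) = |x - 0.3| on X = [0, 1].
# f is convex and 1-Lipschitz, with minimum value 0 at x = 0.3.
def f(x):
    return abs(x - 0.3)


if __name__ == "__main__":
    rng = random.Random(0)
    # A naive strategy that queries uniformly at random; its regret
    # grows linearly in T, unlike the sqrt(T) scaling in the abstract.
    queries = [rng.uniform(0.0, 1.0) for _ in range(1000)]
    # The learner only ever sees these noisy values, never f itself.
    observations = [noisy_oracle(f, x, rng=rng) for x in queries]
    print(cumulative_regret(f, queries, f_opt=0.0))
```

The regret is computed from the true function values, while the learner has access only to `observations`; that gap is what makes the bandit setting harder than optimization with exact function evaluations.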
Pages: 213-240
Number of pages: 28