STOCHASTIC CONVEX OPTIMIZATION WITH BANDIT FEEDBACK

Cited by: 39

Authors:

Agarwal, Alekh [1]

Foster, Dean P. [2]

Hsu, Daniel [3]

Kakade, Sham M. [3]

Rakhlin, Alexander [2]

Affiliations:

[1] Microsoft Res, New York, NY 10016 USA

[2] Univ Penn, Dept Stat, Philadelphia, PA 19104 USA

[3] Microsoft Res, Cambridge, MA 02142 USA

Source:

SIAM JOURNAL ON OPTIMIZATION | 2013 / Vol. 23 / Issue 01

Funding:

U.S. National Science Foundation;

Keywords:

derivative-free optimization; bandit optimization; ellipsoid method;

DOI:

10.1137/110850827

Chinese Library Classification:

O29 [Applied Mathematics];

Discipline classification code:

070104 ;

Abstract:

This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $X$ under a stochastic bandit (i.e., noisy zeroth-order) feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in X$. The quantity of interest is the regret of the algorithm, which is the sum of the function values at the algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs $\tilde{O}(\mathrm{poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.
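
The feedback model and regret described in the abstract can be written out as follows. This is a sketch in assumed notation, not taken from this record: $x_t$ is the $t$-th query point, $y_t$ the observed value, $\varepsilon_t$ a zero-mean noise term, and $x^*$ a minimizer of $f$ over $X$. Here $T$ is read as the number of queries and $d$ as the dimension of $X$, which the abstract does not state explicitly.

```latex
% Noisy zeroth-order (bandit) feedback at query point x_t (assumed notation)
y_t = f(x_t) + \varepsilon_t, \qquad \mathbb{E}[\varepsilon_t] = 0, \qquad x_t \in X

% Regret after T queries: cumulative excess over the optimal function value
R_T = \sum_{t=1}^{T} \bigl( f(x_t) - f(x^*) \bigr), \qquad x^* \in \arg\min_{x \in X} f(x)

% Bounds stated in the abstract
R_T = \tilde{O}\bigl(\mathrm{poly}(d)\,\sqrt{T}\bigr) \ \text{(upper bound)}, \qquad
R_T = \Omega\bigl(\sqrt{T}\bigr) \ \text{(lower bound for any algorithm)}
```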

Citation

Pages: 213 - 240

Number of pages: 28

50 items in total

[41] Online convex optimization in the bandit setting: gradient descent without a gradient
Flaxman, Abraham D.
Kalai, Adam Tauman
McMahan, H. Brendan
PROCEEDINGS OF THE SIXTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2005, : 385 - 394
[42] Improved Regret Bounds for Projection-free Bandit Convex Optimization
Garber, Dan
Kretzu, Ben
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 2196 - 2205
[43] A Convex Optimization Approach to Feedback Scheduling
Ben Gaid, Mongi
Simon, Daniel
Sename, Olivier
2008 MEDITERRANEAN CONFERENCE ON CONTROL AUTOMATION, VOLS 1-4, 2008, : 1210 - +
[44] Push-Sum Distributed Online Optimization With Bandit Feedback
Wang, Cong
Xu, Shengyuan
Yuan, Deming
Zhang, Baoyong
Zhang, Zhengqiang
IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (04) : 2263 - 2273
[45] A Lyapunov-Based Methodology for Constrained Optimization with Bandit Feedback
Cayci, Semih
Zheng, Yilin
Eryilmaz, Atilla
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 3716 - 3723
[46] Online Convex Optimization with Stochastic Constraints
Yu, Hao
Neely, Michael J.
Wei, Xiaohan
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
[47] Dual Solutions in Convex Stochastic Optimization
Pennanen, Teemu
Perkkioe, Ari-Pekka
MATHEMATICS OF OPERATIONS RESEARCH, 2024,
[48] Stochastic Approximation in Convex Multiobjective Optimization
De Bernardi, Carlo Alberto
Miglierina, Enrico
Molho, Elena
Somaglia, Jacopo
JOURNAL OF CONVEX ANALYSIS, 2024, 31 (03) : 761 - 778
[49] Dynamic Programming in Convex Stochastic Optimization
Pennanen, Teemu
Perkkioe, Ari-Pekka
JOURNAL OF CONVEX ANALYSIS, 2023, 30 (04) : 1241 - 1283
[50] Adversarial multi-armed bandit approach to stochastic optimization
Chang, Hyeong Soo
Fu, Michael C.
Marcus, Steven I.
PROCEEDINGS OF THE 45TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2006, : 5684 - +
