STOCHASTIC CONVEX OPTIMIZATION WITH BANDIT FEEDBACK

Cited by: 39

Authors:

Agarwal, Alekh [1]

Foster, Dean P. [2]

Hsu, Daniel [3]

Kakade, Sham M. [3]

Rakhlin, Alexander [2]

Affiliations:

[1] Microsoft Res, New York, NY 10016 USA

[2] Univ Penn, Dept Stat, Philadelphia, PA 19104 USA

[3] Microsoft Res, Cambridge, MA 02142 USA

Source:

SIAM JOURNAL ON OPTIMIZATION | 2013 / Vol. 23 / No. 1

Funding:

National Science Foundation (USA);

Keywords:

derivative-free optimization; bandit optimization; ellipsoid method;

DOI:

10.1137/110850827

Chinese Library Classification (CLC) number:

O29 [Applied mathematics];

Subject classification code:

070104 ;

Abstract:

This paper addresses the problem of minimizing a convex, Lipschitz function f over a convex, compact set X under a stochastic bandit (i.e., noisy zeroth-order) feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value f(x) at any query point $x \in X$. The quantity of interest is the regret of the algorithm, which is the sum of the function values at the algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs $\tilde{O}(\mathrm{poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with T.
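The feedback model and the regret described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's ellipsoid-based algorithm: the objective `f`, the interval X = [0, 1], the Gaussian noise model, the noise level, and the uniform query rule are all hypothetical choices made only for illustration. Regret is read here as the sum over query points of f(x_t) minus the optimal value.

```python
import random

def noisy_oracle(f, x, sigma, rng):
    """Return a noisy realization of f(x); Gaussian noise is an assumption."""
    return f(x) + rng.gauss(0.0, sigma)

def cumulative_regret(f, queries, f_star):
    """Sum of f(x_t) - f_star over all query points x_t."""
    return sum(f(x) - f_star for x in queries)

# Hypothetical convex, Lipschitz objective on X = [0, 1], minimized at x = 0.3.
def f(x):
    return abs(x - 0.3)

rng = random.Random(0)

# Naive uniform querying, used only to exercise the definitions.
queries = [rng.uniform(0.0, 1.0) for _ in range(100)]

# What a learner would actually observe: noisy values only, never f itself.
observations = [noisy_oracle(f, x, 0.1, rng) for x in queries]

# The regret is evaluated with the true f and its optimal value f_star = 0.
regret = cumulative_regret(f, queries, f_star=0.0)
```

A learner only ever sees `observations`; the regret is computed by the analyst from the noiseless function values, so querying the minimizer on every round would give zero regret.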


Pages: 213 - 240

Page count: 28

50 items in total

[21] Bandit Convex Optimization: Towards Tight Bounds
Hazan, Elad
Levy, Kfir Y.
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
[22] Online Learning Algorithm for Distributed Convex Optimization With Time-Varying Coupled Constraints and Bandit Feedback
Li, Jueyou
Gu, Chuanye
Wu, Zhiyou
Huang, Tingwen
IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (02) : 1009 - 1020
[23] A Second-Order Method for Stochastic Bandit Convex Optimisation
Lattimore, Tor
Gyorgy, Andras
THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195, 2023, 195
[24] Pareto Front Identification from Stochastic Bandit Feedback
Auer, Peter
Chiang, Chao-Kai
Ortner, Ronald
Drugan, Madalina M.
ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 939 - 947
[25] (Bandit) Convex Optimization with Biased Noisy Gradient Oracles
Hu, Xiaowei
Prashanth, L. A.
Gyorgy, Andras
Szepesvari, Csaba
ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 819 - 828
[26] Bandit Convex Optimization for Scalable and Dynamic IoT Management
Chen, Tianyi
Giannakis, Georgios B.
IEEE INTERNET OF THINGS JOURNAL, 2019, 6 (01) : 1276 - 1286
[27] Kernel-based Methods for Bandit Convex Optimization
Bubeck, Sebastien
Eldan, Ronen
Lee, Yin Tat
JOURNAL OF THE ACM, 2021, 68 (04)
[28] Bandit Convex Optimization in Non-stationary Environments
Zhao, Peng
Wang, Guanghui
Zhang, Lijun
Zhou, Zhi-Hua
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 1508 - 1517
[29] Bandit Convex Optimization in Non-stationary Environments
Zhao, Peng
Wang, Guanghui
Zhang, Lijun
Zhou, Zhi-Hua
JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22
[30] Kernel-Based Methods for Bandit Convex Optimization
Bubeck, Sebastien
Lee, Yin Tat
Eldan, Ronen
STOC'17: PROCEEDINGS OF THE 49TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING, 2017 : 72 - 85
