Best-of-Both-Worlds Algorithms for Partial Monitoring

Cited by: 0

Authors:

Tsuchiya, Taira ^{[1,2]}

Ito, Shinji ^{[3]}

Honda, Junya ^{[1,2]}

Affiliations:

[1] Kyoto Univ, Kyoto, Japan

[2] RIKEN AIP, Tokyo, Japan

[3] NEC Corp Ltd, Tokyo, Japan

Source:

INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201 | 2023 / Vol. 201

Keywords:

partial monitoring; best-of-both-worlds; follow-the-regularized-leader; stochastic regime with adversarial corruptions; REGRET;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

This study considers the partial monitoring problem with $k$ actions and $d$ outcomes and provides the first best-of-both-worlds algorithms, whose regrets are favorably bounded in both the stochastic and adversarial regimes. In particular, we show that for non-degenerate locally observable games, the regret is $O(m^2 k^4 \log(T) \log(k_\Pi T) / \Delta_{\min})$ in the stochastic regime and $O(m k^{3/2} \sqrt{T \log(T) \log k_\Pi})$ in the adversarial regime, where $T$ is the number of rounds, $m$ is the maximum number of distinct observations per action, $\Delta_{\min}$ is the minimum suboptimality gap, and $k_\Pi$ is the number of Pareto optimal actions. Moreover, we show that for globally observable games, the regret is $O(c_{\mathcal{G}}^2 \log(T) \log(k_\Pi T) / \Delta_{\min}^2)$ in the stochastic regime and $O((c_{\mathcal{G}}^2 \log(T) \log(k_\Pi T))^{1/3} T^{2/3})$ in the adversarial regime, where $c_{\mathcal{G}}$ is a game-dependent constant. We also provide regret bounds for a stochastic regime with adversarial corruptions. Our algorithms are based on the follow-the-regularized-leader framework and are inspired by the approach of exploration by optimization and the adaptive learning rate in the field of online learning with feedback graphs.
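
The abstract names follow-the-regularized-leader (FTRL) as the underlying framework. As a rough illustration only, the Python sketch below shows one generic FTRL step over the probability simplex with a negative Shannon entropy regularizer, for which the minimizer has the closed form of exponential weights. It is not the paper's algorithm: the choice of regularizer, the fixed learning rate, the function name `ftrl_distribution`, and the loss values are all assumptions made for illustration, and the loss estimation, exploration by optimization, and adaptive learning rate mentioned in the abstract are omitted.

```python
import math


def ftrl_distribution(cumulative_losses, learning_rate):
    """Generic FTRL step on the simplex (illustrative assumption, not the paper's method).

    Solves argmin_p <L, p> + (1 / learning_rate) * sum_i p_i * log(p_i),
    whose solution is p_i proportional to exp(-learning_rate * L_i).
    """
    # Subtracting the minimum loss improves numerical stability
    # and does not change the normalized result.
    base = min(cumulative_losses)
    weights = [math.exp(-learning_rate * (loss - base)) for loss in cumulative_losses]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical cumulative loss estimates for k = 3 actions.
losses = [2.0, 1.0, 4.0]
p = ftrl_distribution(losses, learning_rate=0.5)
```

Actions with smaller cumulative loss receive larger probability, and a learning rate of zero yields the uniform distribution.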


Pages: 1484-1515

Page count: 32
