Best-of-Both-Worlds Algorithms for Partial Monitoring

Cited by: 0
Authors
Tsuchiya, Taira [1, 2]
Ito, Shinji [3]
Honda, Junya [1, 2]
Affiliations
[1] Kyoto Univ, Kyoto, Japan
[2] RIKEN AIP, Tokyo, Japan
[3] NEC Corp Ltd, Tokyo, Japan
Keywords
partial monitoring; best-of-both-worlds; follow-the-regularized-leader; stochastic regime with adversarial corruptions; regret
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This study considers the partial monitoring problem with $k$ actions and $d$ outcomes and provides the first best-of-both-worlds algorithms, whose regrets are favorably bounded in both the stochastic and adversarial regimes. In particular, we show that for non-degenerate locally observable games, the regret is $O(m^2 k^4 \log(T) \log(k_\Pi T) / \Delta_{\min})$ in the stochastic regime and $O(m k^{3/2} \sqrt{T \log(T) \log k_\Pi})$ in the adversarial regime, where $T$ is the number of rounds, $m$ is the maximum number of distinct observations per action, $\Delta_{\min}$ is the minimum suboptimality gap, and $k_\Pi$ is the number of Pareto optimal actions. Moreover, we show that for globally observable games, the regret is $O(c_{\mathcal{G}}^2 \log(T) \log(k_\Pi T) / \Delta_{\min}^2)$ in the stochastic regime and $O((c_{\mathcal{G}}^2 \log(T) \log(k_\Pi T))^{1/3} T^{2/3})$ in the adversarial regime, where $c_{\mathcal{G}}$ is a game-dependent constant. We also provide regret bounds for a stochastic regime with adversarial corruptions. Our algorithms are based on the follow-the-regularized-leader framework and are inspired by the approach of exploration by optimization and the adaptive learning rate in the field of online learning with feedback graphs.
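To make the two central notions in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the textbook definition of regret against the best fixed action for a $k \times d$ loss matrix, and it uses the negative Shannon entropy as the regularizer, for which the follow-the-regularized-leader update has a closed form (a softmax over negated cumulative loss estimates). The function names, the fixed learning rate `eta`, and the example loss matrix are all hypothetical; the paper's exploration-by-optimization step and adaptive learning rate are not reproduced here.

```python
import math


def regret(loss_matrix, actions, outcomes):
    """Regret against the best fixed action in hindsight.

    loss_matrix[a][x] is the loss of action a under outcome x
    (k rows, d columns). This is the standard definition, assumed here.
    """
    learner_loss = sum(loss_matrix[a][x] for a, x in zip(actions, outcomes))
    best_fixed_loss = min(
        sum(row[x] for x in outcomes) for row in loss_matrix
    )
    return learner_loss - best_fixed_loss


def ftrl_neg_entropy(cum_loss_est, eta):
    """FTRL on the probability simplex with a negative-entropy regularizer.

    Solves argmin_p <L, p> + (1/eta) * sum_i p_i log p_i, whose solution
    is p_i proportional to exp(-eta * L_i). Subtracting the minimum
    before exponentiating avoids overflow.
    """
    base = min(cum_loss_est)
    weights = [math.exp(-eta * (value - base)) for value in cum_loss_est]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 2-action, 2-outcome game: each action is correct on one outcome.
example_losses = [[0.0, 1.0], [1.0, 0.0]]
example_regret = regret(example_losses, actions=[0, 0], outcomes=[1, 1])
example_policy = ftrl_neg_entropy([0.0, 10.0], eta=1.0)
```

In this sketch the learner plays action 0 twice while the outcome is 1 both times, so it incurs loss 2 against a best fixed loss of 0. The policy concentrates on the action with the smaller cumulative loss estimate, and a larger `eta` makes that concentration sharper.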
Pages: 1484-1515
Page count: 32