Regret minimization under partial monitoring

Cited by: 0

Authors:

Cesa-Bianchi, Nicolo [1]

Lugosi, Gabor [2]

Stoltz, Gilles [3]

Affiliations:

[1] Univ Milan, Dipartimento Sci Informaz, I-20135 Milan, Italy

[2] Pompeu Fabra Univ, Dept Econ, Barcelona, Spain

[3] Ecole Normale Super, Dept Math Appl, Paris, France

Source:

2006 IEEE INFORMATION THEORY WORKSHOP | 2006

DOI:

10.1109/ITW.2006.1633784

Chinese Library Classification (CLC):

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

We consider repeated games in which the player, instead of observing the action chosen by the opponent in each game round, receives feedback generated by the combined choice of the two players. We study Hannan-consistent players for these games, that is, randomized playing strategies whose per-round regret vanishes with probability one as the number of game rounds goes to infinity. We prove a general lower bound for the convergence rate of the regret, and exhibit a specific strategy that attains this rate for any game for which a Hannan-consistent player exists.


Pages: 72 / +

Page count: 2

50 items in total

[1] Regret minimization under partial monitoring
Cesa-Bianchi, Nicolo
Lugosi, Gabor
Stoltz, Gilles
MATHEMATICS OF OPERATIONS RESEARCH, 2006, 31 (03) : 562 - 580
[2] A PDE approach for regret bounds under partial monitoring
Bayraktar, Erhan
Ekren, Ibrahim
Zhang, Xin
JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
[3] Regret Bounds and Minimax Policies under Partial Monitoring
Audibert, Jean-Yves
Bubeck, Sebastien
JOURNAL OF MACHINE LEARNING RESEARCH, 2010, 11 : 2785 - 2836
[4] Regret bounds and minimax policies under partial monitoring
Audibert, Jean-Yves
Bubeck, Sébastien
Journal of Machine Learning Research, 2010, 11 : 2785 - 2863
[5] Minimax Regret for Partial Monitoring: Infinite Outcomes and Rustichini's Regret
Lattimore, Tor
CONFERENCE ON LEARNING THEORY, VOL 178, 2022, 178
[6] Regret Minimization in Billboard Advertisement under Zonal Influence Constraint
Ali, Dildar
Banerjee, Suman
Prasad, Yamuna
39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024, 2024, : 329 - 336
[7] Partial Monitoring-Classification, Regret Bounds, and Algorithms
Bartok, Gabor
Foster, Dean P.
Pal, David
Rakhlin, Alexander
Szepesvari, Csaba
MATHEMATICS OF OPERATIONS RESEARCH, 2014, 39 (04) : 967 - 997
[8] Regret minimization in online Bayesian persuasion: Handling adversarial receiver's types under full and partial feedback models
Castiglioni, Matteo
Celli, Andrea
Marchesi, Alberto
Gatti, Nicola
ARTIFICIAL INTELLIGENCE, 2023, 314
[9] Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence
Cohen, Johanne
Heliou, Amelie
Mertikopoulos, Panayotis
ALGORITHMIC GAME THEORY (SAGT 2017), 2017, 10504 : 252 - 263
[10] UTILITY MAXIMIZATION VS REGRET MINIMIZATION: CHOICE BEHAVIOR UNDER UNCERTAINTY
Jiao, X.
van Cranenburgh, S.
Gu, N. Y.
VALUE IN HEALTH, 2023, 26 (06) : S401 - S402
