Universal complexity bounds based on value iteration for stochastic mean payoff games and entropy games

Authors
Allamigeon, Xavier [1, 2]
Gaubert, Stephane [1, 2]
Katz, Ricardo D. [3]
Skomra, Mateusz [4]
Affiliations
[1] CNRS, Ecole Polytech, INRIA, IP Paris, Palaiseau, France
[2] CNRS, Ecole Polytech, CMAP, IP Paris, Palaiseau, France
[3] Consejo Nacl Invest Cient & Tecn, CIFASIS, Bv 27 Febrero 210 bis, RA-2000 Rosario, Argentina
[4] Univ Toulouse, CNRS, LAAS, Toulouse, France
Keywords
Mean-payoff games; Entropy games; Value iteration; Perron root; Separation bounds; Parameterized complexity; DYNAMIC-PROGRAMMING RECURSIONS; PERFECT INFORMATION; OPERATOR APPROACH; ALGORITHM; EXPANSIONS; NUMBERS;
DOI
10.1016/j.ic.2024.105236
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We develop value iteration-based algorithms to solve in a unified manner different classes of combinatorial zero-sum games with mean-payoff type rewards. These algorithms rely on an oracle, evaluating the dynamic programming operator up to a given precision. We show that the number of calls to the oracle needed to determine exact optimal (positional) strategies is, up to a factor polynomial in the dimension, of order R/sep, where the "separation" sep is defined as the minimal difference between distinct values arising from strategies, and R is a metric estimate, involving the norm of approximate sub- and super-eigenvectors of the dynamic programming operator. We illustrate this method by two applications. The first one is a new proof, leading to improved complexity estimates, of a theorem of Boros, Elbassioni, Gurvich and Makino, showing that turn-based mean-payoff games with a fixed number of random positions can be solved in pseudo-polynomial time. The second one concerns entropy games, a model introduced by Asarin, Cervelle, Degorre, Dima, Horn and Kozyakin. The rank of an entropy game is defined as the maximal rank among all the ambiguity matrices determined by strategies of the two players. We show that entropy games with a fixed rank, in their original formulation, can be solved in polynomial time, and that an extension of entropy games incorporating weights can be solved in pseudo-polynomial time under the same fixed rank condition. © 2024 Elsevier Inc. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
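As a rough illustration of the value iteration idea in the abstract, the following Python sketch treats the simplest special case: a deterministic turn-based mean-payoff game with integer weights. It is not the algorithm of the paper. It follows the classical iterate-then-round scheme usually attributed to Zwick and Paterson, in which the iteration count 4n³W and the rounding to denominators at most n are taken from that classical analysis. The function names, the game encoding (`edges`, `owner`), and the toy instance are assumptions made for this example only. The rounding step plays a role loosely analogous to the separation bound: once the estimate is closer to the true value than half the gap between distinct candidate values, the exact value can be read off.

```python
from fractions import Fraction


def shapley_operator(v, edges, owner):
    """Apply the dynamic programming operator once.

    edges[i] is a list of (successor, weight) pairs for node i (hypothetical encoding).
    owner[i] is "max" or "min", the player who moves at node i.
    """
    out = []
    for i, successors in enumerate(edges):
        candidates = [w + v[j] for (j, w) in successors]
        out.append(max(candidates) if owner[i] == "max" else min(candidates))
    return out


def mean_payoff_values(edges, owner):
    """Estimate the value of every node by value iteration, then round.

    Assumes integer weights. The iteration count 4 * n^3 * W comes from the
    classical analysis of deterministic mean-payoff games, not from the paper.
    """
    n = len(edges)
    W = max(abs(w) for successors in edges for (_, w) in successors)
    k = max(1, 4 * n**3 * W)
    v = [0] * n
    for _ in range(k):
        v = shapley_operator(v, edges, owner)
    # Values are rationals with denominator at most n, so round to the
    # nearest such fraction.
    return [Fraction(x, k).limit_denominator(n) for x in v]


# Toy instance (an assumption for illustration): node 0 belongs to Max,
# node 1 belongs to Min.
edges = [
    [(0, 1), (1, 5)],  # node 0: self-loop with weight 1, or move to node 1 with weight 5
    [(1, 0), (0, 0)],  # node 1: self-loop with weight 0, or move to node 0 with weight 0
]
owner = ["max", "min"]
values = mean_payoff_values(edges, owner)
```

In this toy game Min can stay at node 1 forever for a mean payoff of 0, and Max does best by looping at node 0 for a mean payoff of 1, so `values` holds those two numbers as exact fractions.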
Pages: 31
Related papers
50 in total
  • [21] On Canonical Forms for Zero-Sum Stochastic Mean Payoff Games
    Endre Boros
    Khaled Elbassioni
    Vladimir Gurvich
    Kazuhisa Makino
    Dynamic Games and Applications, 2013, 3 : 128 - 161
  • [22] Mean-Payoff Pushdown Games
    Chatterjee, Krishnendu
    Velner, Yaron
    2012 27TH ANNUAL ACM/IEEE SYMPOSIUM ON LOGIC IN COMPUTER SCIENCE (LICS), 2012, : 195 - 204
  • [23] Potential theory for mean payoff games
    Lifshits Y.M.
    Pavlov D.S.
    Journal of Mathematical Sciences, 2007, 145 (3) : 4967 - 4974
  • [24] Mean-payoff parity games
    Chatterjee, K
    Henzinger, TA
    Jurdzinski, M
    LICS 2005: 20th Annual IEEE Symposium on Logic in Computer Science - Proceedings, 2005, : 178 - 187
  • [25] Stochastic Games with Average Payoff Criterion
    M. K. Ghosh
    A. Bagchi
    Applied Mathematics and Optimization, 1998, 38 : 283 - 301
  • [26] Stochastic Games with General Payoff Functions
    Flesch, Janos
    Solan, Eilon
    MATHEMATICS OF OPERATIONS RESEARCH, 2024, 49 (03) : 1349 - 1371
  • [27] Stochastic games with average payoff criterion
    Ghosh, MK
    Bagchi, A
    APPLIED MATHEMATICS AND OPTIMIZATION, 1998, 38 (03): : 283 - 301
  • [28] The Complexity of Solving Reachability Games Using Value and Strategy Iteration
    Kristoffer Arnsfelt Hansen
    Rasmus Ibsen-Jensen
    Peter Bro Miltersen
    Theory of Computing Systems, 2014, 55 : 380 - 403
  • [29] The Complexity of Solving Reachability Games Using Value and Strategy Iteration
    Hansen, Kristoffer Arnsfelt
    Ibsen-Jensen, Rasmus
    Miltersen, Peter Bro
    THEORY OF COMPUTING SYSTEMS, 2014, 55 (02) : 380 - 403
  • [30] Deterministic priority mean-payoff games as limits of discounted games
    Gimbert, Hugo
    Zielonka, Wieslaw
    AUTOMATA, LANGUAGES AND PROGRAMMING, PT 2, 2006, 4052 : 312 - 323