Optimal time-abstract schedulers for CTMDPs and continuous-time Markov games

Cited by: 7
Authors
Rabe, Markus N. [1 ]
Schewe, Sven [1 ]
Affiliation
[1] Univ Liverpool, Liverpool L69 3BX, Merseyside, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Continuous-time Markov decision processes; Continuous-time Markov games; Optimal control; Time-bounded reachability; Bounded reachability
DOI
10.1016/j.tcs.2012.10.001
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We study time-bounded reachability in continuous-time Markov decision processes (CTMDPs) and games (CTGs) for time-abstract scheduler classes. Reachability problems play a paramount role in probabilistic model checking. Consequently, their analysis has been studied intensively, and approximation techniques are well understood. From a mathematical point of view, however, the question of approximation is secondary compared to the fundamental question whether or not optimal control exists. In this article, we demonstrate the existence of optimal schedulers for the time-abstract scheduler classes for CTMDPs. For CTGs, we distinguish two cases: the simple case where both players face the same restriction to use time-abstract strategies (symmetric CTGs) and the case where one player is a completely informed adversary (asymmetric CTGs). While for the former case optimal strategies exist, we prove that for asymmetric CTGs there is not necessarily a scheduler that attains the optimum. It turns out that for CTMDPs and symmetric CTGs optimal time-abstract schedulers have an amazingly simple structure: they converge to a memoryless scheduling policy after a finite number of steps. This allows us to compute time-abstract strategies with finite memory. (C) 2012 Elsevier B.V. All rights reserved.
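To make the abstract's notion of a time-abstract (step-counting) scheduler concrete, the following is a minimal sketch and not the construction used in the article. It assumes a uniform CTMDP, in which every state has the same exit rate `rate`, so the number of jumps up to time `t` is Poisson distributed, and it truncates the Poisson sum at a hypothetical bound `n_max`. The function names, the dictionary encoding of the model, and the toy example are all illustrative assumptions.

```python
import math

def poisson_weights(rate_times_t, n_max):
    """Return [psi(0), ..., psi(n_max)], the Poisson probabilities of n jumps."""
    weights = [math.exp(-rate_times_t)]
    for n in range(1, n_max + 1):
        weights.append(weights[-1] * rate_times_t / n)
    return weights

def optimize_counting_scheduler(P, goal, rate, t, n_max):
    """Backward induction over the step count for a uniform CTMDP (assumption).

    P[s][action] is a dict {successor: probability}; every state is a key of P,
    goal states are treated as absorbing, and every non-goal state is assumed
    to have at least one action.
    Returns (value, policy): value[s] is the truncated reachability probability
    from s, and policy[n][s] is the action chosen in s after n jumps.
    """
    psi = poisson_weights(rate * t, n_max)
    # value[s] at step n: Poisson mass collected from step n onwards, given state s.
    value = {s: (psi[n_max] if s in goal else 0.0) for s in P}
    policy = [dict() for _ in range(n_max)]
    for n in range(n_max - 1, -1, -1):
        new_value = {}
        for s in P:
            if s in goal:
                new_value[s] = psi[n] + value[s]
                continue
            best_action, best = None, -1.0
            for action, successors in P[s].items():
                q = sum(p * value[s2] for s2, p in successors.items())
                if q > best:
                    best_action, best = action, q
            policy[n][s] = best_action
            new_value[s] = best
        value = new_value
    return value, policy

# Hypothetical toy model: "a" is a risky one-step shortcut, "b" a safe two-step route.
P = {
    "s0": {"a": {"goal": 0.5, "s0": 0.5}, "b": {"s1": 1.0}},
    "s1": {"c": {"goal": 1.0}},
    "goal": {},
}
value, policy = optimize_counting_scheduler(P, {"goal"}, rate=1.0, t=5.0, n_max=60)
print(value["s0"], policy[0]["s0"], policy[59]["s0"])
```

In this toy model the computed scheduler picks the two-step route `b` before any jump has occurred and the one-step gamble `a` at the last step before truncation, which illustrates why a time-abstract scheduler may need to count steps at all. The truncated value underestimates the true optimum by at most the Poisson tail mass beyond `n_max`.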
Pages: 53-67
Number of pages: 15