Incremental value iteration for time-aggregated Markov-decision processes

Times cited: 22
Authors
Sun, Tao [1]
Zhao, Qianchuan
Luh, Peter B.
Affiliations
[1] Tsing Hua Univ, Ctr Intelligent & Networked Syst CFINS, Dept Automat, Beijing 100084, Peoples R China
[2] Univ Connecticut, Dept Elect & Comp Engn, Storrs, CT 06269 USA
Keywords
fractional cost; Markov-decision processes (MDPs); policy iteration; time aggregation; value iteration
DOI
10.1109/TAC.2007.908359
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A value iteration algorithm for time-aggregated Markov-decision processes (MDPs) is developed to solve problems with large state spaces. The algorithm is based on a novel approach that solves a time-aggregated MDP by incrementally solving a set of standard MDPs. As a result, the algorithm converges under the same assumption as standard value iteration, which is much weaker than the assumption required by the existing time-aggregated value iteration algorithm. The algorithms developed in this paper are also applicable to MDPs with fractional costs.
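For orientation, the standard value iteration that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration of the textbook discounted-cost version, not the time-aggregated algorithm of the paper; the function name `value_iteration`, the discount factor `gamma`, and the two-state toy MDP are all assumptions made for this example.

```python
def value_iteration(P, c, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Textbook discounted-cost value iteration (illustrative sketch only).

    P[a][s][t] is the probability of moving from state s to state t under
    action a, and c[a][s] is the one-step cost of taking action a in state s.
    Returns the approximate optimal cost vector and a greedy policy.
    """
    n_actions, n_states = len(P), len(P[0])
    V = [0.0] * n_states
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(max_iter):
        # Bellman backup: one-step cost plus discounted expected future cost.
        Q = [
            [
                c[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        V_new = [min(q_s) for q_s in Q]
        diff = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if diff < tol:
            break
    # Greedy policy with respect to the last computed action values.
    policy = [min(range(n_actions), key=lambda a, s=s: Q[s][a]) for s in range(n_states)]
    return V, policy


# Hypothetical toy MDP: action 0 stays in place, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
c = [
    [1.0, 0.0],  # cost of staying in state 0 / state 1
    [0.5, 2.0],  # cost of switching from state 0 / state 1
]

V, policy = value_iteration(P, c)
print(V, policy)
```

In this toy problem, state 1 is free to stay in, so the greedy policy switches out of state 0 once and then stays in state 1.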
Pages: 2177-2182
Number of pages: 6