Incremental value iteration for time-aggregated Markov-decision processes

Cited by: 22
Authors
Sun, Tao [1]
Zhao, Qianchuan
Luh, Peter B.
Affiliations
[1] Tsing Hua Univ, Ctr Intelligent & Networked Syst CFINS, Dept Automat, Beijing 100084, Peoples R China
[2] Univ Connecticut, Dept Elect & Comp Engn, Storrs, CT 06269 USA
Keywords
fractional cost; Markov-decision processes (MDPs); policy iteration; time aggregation; value iteration
DOI
10.1109/TAC.2007.908359
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A value iteration algorithm for time-aggregated Markov-decision processes (MDPs) is developed to solve problems with large state spaces. The algorithm is based on a novel approach that solves a time-aggregated MDP by incrementally solving a set of standard MDPs. As a result, the algorithm converges under the same assumption as standard value iteration, which is much weaker than the assumption required by the existing time-aggregated value iteration algorithm. The algorithms developed in this paper are also applicable to MDPs with fractional costs.
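The idea of reducing a fractional-cost MDP to a sequence of standard MDPs can be illustrated with a small sketch. This is not the paper's incremental algorithm: it is a plain bisection on a candidate ratio `eta`, where each step solves a standard average-cost MDP with cost `f - eta * g` by relative value iteration. The names `gain`, `min_ratio`, and the two-state data `P`, `f`, `g` are all hypothetical, and the sketch assumes `g > 0` and strictly positive transition probabilities so that relative value iteration converges.

```python
def gain(P, cost, tol=1e-10, max_iter=100_000):
    """Optimal average cost of a standard MDP via relative value iteration."""
    n = len(P)
    h = [0.0] * n  # relative value function, pinned to 0 at state 0
    for _ in range(max_iter):
        # One Bellman backup: minimise one-step cost plus expected next value.
        Th = [min(cost[i][a] + sum(p * hj for p, hj in zip(P[i][a], h))
                  for a in range(len(P[i])))
              for i in range(n)]
        diff = [t - x for t, x in zip(Th, h)]
        lo, hi = min(diff), max(diff)  # lower and upper bounds on the gain
        h = [t - Th[0] for t in Th]    # renormalise against reference state 0
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def min_ratio(P, f, g, tol=1e-8):
    """Minimal long-run ratio avg(f) / avg(g) by bisection on eta (assumes g > 0)."""
    n = len(P)
    ratios = [f[i][a] / g[i][a] for i in range(n) for a in range(len(P[i]))]
    lo, hi = min(ratios), max(ratios)  # every policy's ratio lies in this bracket
    while hi - lo > tol:
        eta = 0.5 * (lo + hi)
        # Standard MDP whose optimal gain is zero exactly at the optimal ratio.
        cost = [[f[i][a] - eta * g[i][a] for a in range(len(P[i]))]
                for i in range(n)]
        if gain(P, cost) > 0:
            lo = eta  # no policy achieves a ratio as small as eta
        else:
            hi = eta
    return 0.5 * (lo + hi)


# Hypothetical two-state, two-action example: P[i][a][j], f[i][a], g[i][a].
P = [[[0.9, 0.1], [0.5, 0.5]],
     [[0.4, 0.6], [0.8, 0.2]]]
f = [[1.0, 2.0], [4.0, 3.0]]
g = [[1.0, 1.5], [2.0, 1.0]]

eta_star = min_ratio(P, f, g)
print(eta_star)
```

Bisection is chosen here only because the optimal gain is monotone in `eta` when `g > 0`, which keeps the sketch short; it solves each standard MDP to convergence rather than interleaving updates.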
Pages: 2177 - 2182
Page count: 6
Related papers
50 in total
  • [31] A Note on Generalized Second-Order Value Iteration in Markov Decision Processes
    Villavarayan Antony Vijesh
    Shreyas Sumithra Rudresha
    Mohammed Shahid Abdulla
    Journal of Optimization Theory and Applications, 2023, 199 : 1022 - 1049
  • [32] A NEW PARALLELIZED OF HIERARCHICAL VALUE ITERATION ALGORITHM FOR DISCOUNTED MARKOV DECISION PROCESSES
    Nachaoui, Mourad
    Chafik, Sanae
    Daoui, Cherki
    DISCRETE AND CONTINUOUS DYNAMICAL SYSTEMS-SERIES S, 2025, 18 (01) : 1 - 14
  • [33] Sketched Newton Value Iteration for Large-Scale Markov Decision Processes
    Liu, Jinsong
    Xie, Chenghan
    Deng, Qi
    Ge, Dongdong
    Ye, Yinyu
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 12, 2024 : 13936 - 13944
  • [34] Toward an Optimized Value Iteration Algorithm for Average Cost Markov Decision Processes
    Arruda, Edilson F.
    Ourique, Fabricio
    Almudevar, Anthony
    49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2010 : 930 - 934
  • [35] Geometric Policy Iteration for Markov Decision Processes
    Wu, Yue
    De Loera, Jesus A.
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022 : 2070 - 2078
  • [36] Policy set iteration for Markov decision processes
    Chang, Hyeong Soo
    AUTOMATICA, 2013, 49 (12) : 3687 - 3689
  • [37] SERIAL AND PARALLEL VALUE-ITERATION ALGORITHMS FOR DISCOUNTED MARKOV DECISION-PROCESSES
    ARCHIBALD, TW
    MCKINNON, KIM
    THOMAS, LC
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1993, 67 (02) : 188 - 203
  • [38] COMPUTATIONAL COMPARISON OF VALUE-ITERATION ALGORITHMS FOR DISCOUNTED MARKOV DECISION-PROCESSES
    THOMAS, LC
    HARLEY, R
    LAVERCOMBE, AC
    OPERATIONS RESEARCH LETTERS, 1983, 2 (02) : 72 - 76
  • [39] An optimistic value iteration for mean-variance optimization in discounted Markov decision processes
    Ma, Shuai
    Ma, Xiaoteng
    Xia, Li
    RESULTS IN CONTROL AND OPTIMIZATION, 2022, 8
  • [40] VALUE ITERATION IN COUNTABLE STATE AVERAGE COST MARKOV DECISION PROCESSES WITH UNBOUNDED COSTS
    Sennott, Linn I.
    ANNALS OF OPERATIONS RESEARCH, 1991, 28 (01) : 261 - 271