Incremental value iteration for time-aggregated Markov-decision processes

Cited by: 22

Authors:

Sun, Tao [1]

Zhao, Qianchuan

Luh, Peter B.

Affiliations:

[1] Tsing Hua Univ, Ctr Intelligent & Networked Syst CFINS, Dept Automat, Beijing 100084, Peoples R China

[2] Univ Connecticut, Dept Elect & Comp Engn, Storrs, CT 06269 USA

Source:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL | 2007 / Vol. 52 / No. 11

Keywords:

fractional cost; Markov-decision processes (MDPs); policy iteration; time aggregation; value iteration;

DOI:

10.1109/TAC.2007.908359

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

A value iteration algorithm for time-aggregated Markov-decision processes (MDPs) is developed to solve problems with large state spaces. The algorithm is based on a novel approach that solves a time-aggregated MDP by incrementally solving a set of standard MDPs. Therefore, the algorithm converges under the same assumption as standard value iteration. This assumption is much weaker than that required by the existing time-aggregated value iteration algorithm. The algorithms developed in this paper are also applicable to MDPs with fractional costs.


Pages: 2177 - 2182

Page count: 6

50 entries in total

[21] IntervalMDP.jl: Accelerated Value Iteration for Interval Markov Decision Processes
Mathiesen, Frederik Baymler
Lahijanian, Morteza
Laurenti, Luca
IFAC PAPERSONLINE, 2024, 58 (11) : 1 - 6
[22] Variance reduced value iteration and faster algorithms for solving Markov decision processes
Sidford, Aaron
Wang, Mengdi
Wu, Xian
Ye, Yinyu
NAVAL RESEARCH LOGISTICS, 2023, 70 (05) : 423 - 442
[23] A pause control approach to the value iteration scheme in average Markov decision processes
Cavazos-Cadena, R
SYSTEMS & CONTROL LETTERS, 1998, 33 (04) : 209 - 219
[24] A method for speeding up value iteration in partially observable Markov decision processes
Zhang, NL
Lee, SS
Zhang, WH
UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 1999, : 696 - 703
[25] ISOTONE POLICIES FOR THE VALUE-ITERATION METHOD FOR MARKOV DECISION-PROCESSES
WHITE, DJ
OR SPEKTRUM, 1984, 6 (04) : 223 - 227
[26] Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes
Sidford, Aaron
Wang, Mengdi
Wu, Xian
Ye, Yinyu
SODA'18: PROCEEDINGS OF THE TWENTY-NINTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2018, : 770 - 787
[27] Value Iteration and Action ε-Approximation of Optimal Policies in Discounted Markov Decision Processes
Montes-De-Oca, Raul
Lemus-Rodriguez, Enrique
RECENT ADVANCES IN APPLIED MATHEMATICS, 2009, : 213 - +
[28] Value Iteration for Long-Run Average Reward in Markov Decision Processes
Ashok, Pranav
Chatterjee, Krishnendu
Daca, Przemyslaw
Kretinsky, Jan
Meggendorfer, Tobias
COMPUTER AIDED VERIFICATION, CAV 2017, PT I, 2017, 10426 : 201 - 221
[29] Speeding up the convergence of value iteration in partially observable Markov decision processes
Zhang, NL
Zhang, WH
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2001, 14 : 29 - 51
[30] A Note on Generalized Second-Order Value Iteration in Markov Decision Processes
Vijesh, Villavarayan Antony
Rudresha, Shreyas Sumithra
Abdulla, Mohammed Shahid
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2023, 199 (03) : 1022 - 1049
