Approximation of Stationary Control Policies by Quantized Control in Markov Decision Processes

Cited by: 0

Authors:

Saldi, Noel [1]

Linder, Tamas [1]

Yueksel, Serdar [1]

Affiliations:

[1] Queens Univ, Dept Math & Stat, Kingston, ON K7L 3N6, Canada

Source:

2013 51ST ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON) | 2013

Keywords:

FINITE-STATE APPROXIMATIONS; SPACE;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

We consider the problem of approximating optimal stationary control policies by quantized control. Stationary quantizer policies are introduced, and it is shown that such policies are epsilon-optimal among stationary policies under mild technical conditions. Quantitative bounds on the approximation error in terms of the rate of the approximating quantizers are also derived. Thus, one can search for epsilon-optimal policies within the class of quantized control policies. These results pave the way for applications in the optimal design of networked control systems where controller actions need to be quantized, as well as for a new computational method for generating approximately optimal Markov decision policies in general (Borel) state and action spaces, for both discounted cost and average cost infinite horizon optimal control problems.
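As a rough illustration of the idea in the abstract, the sketch below restricts a continuous action set to a finite uniform grid and runs discounted-cost value iteration on a small toy model. The model (five states, action interval [0, 1], the cost `s + 2*a**2`, the discount factor 0.9) and the function name `quantized_value_iteration` are assumptions made for this example; they are not taken from the paper, and the sketch does not reproduce the paper's bounds.

```python
import numpy as np

def quantized_value_iteration(n_levels, n_states=5, beta=0.9, iters=500):
    """Discounted-cost value iteration with actions restricted to a uniform grid on [0, 1].

    Hypothetical toy model: under action a, the state moves one step down
    with probability a and one step up with probability 1 - a (clipped at
    the boundaries). The one-stage cost is s + 2 * a**2.
    """
    states = np.arange(n_states)
    actions = np.linspace(0.0, 1.0, n_levels)   # quantized action set
    down = np.maximum(states - 1, 0)            # next state when moving down
    up = np.minimum(states + 1, n_states - 1)   # next state when moving up

    a = actions[None, :]                        # shape (1, n_levels)
    stage_cost = states[:, None] + 2.0 * a**2   # shape (n_states, n_levels)

    V = np.zeros(n_states)
    for _ in range(iters):
        # Q[s, j]: cost of using grid action j in state s, then following V
        Q = stage_cost + beta * (a * V[down][:, None] + (1.0 - a) * V[up][:, None])
        V = Q.min(axis=1)
    return V

# Nested grids (2, 3, 5, 9 points): each finer grid contains the coarser one,
# so the optimal cost over the finer grid cannot be larger.
for n in (2, 3, 5, 9):
    V = quantized_value_iteration(n)
    print(n, V[0])
```

Because the grids are nested, the printed cost at state 0 should not increase as the number of quantization levels grows; how quickly it approaches the unquantized optimum is the kind of question the rate bounds in the abstract address.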


Pages: 78-84

Page count: 7

