A note on the existence of optimal stationary policies for average Markov decision processes with countable states

Cited by: 1

Authors:

Xia, Li [1]

Guo, Xianping [2]

Cao, Xi-Ren [3]

Affiliations:

[1] Sun Yat Sen Univ, Sch Business, Guangzhou 510275, Peoples R China

[2] Sun Yat Sen Univ, Sch Math, Guangzhou 510275, Peoples R China

[3] Hong Kong Univ Sci & Technol, Clear Water Bay, Hong Kong, Peoples R China

Source:

AUTOMATICA | 2023 / Vol. 151

Funding:

National Natural Science Foundation of China;

Keywords:

Markov decision process; Countable states; Optimal stationary policy; Metric space; CHAINS;

DOI:

10.1016/j.automatica.2023.110877

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In many practical stochastic dynamic optimization problems with countable states, the optimal policy possesses certain structural properties. Examples include the (s, S) policy in inventory control, and the well-known cμ-rule and the recently discovered c/μ-rule (Xia et al. (2022)) in the scheduling of queues. Such results presume that an optimal stationary policy exists. Many research works address the existence of optimal stationary policies for Markov decision processes with countable state spaces (see, e.g., Bertsekas (2012); Hernandez-Lerma and Lasserre (1996); Puterman (1994); Sennott (1999)). However, the conditions given in these works are usually not easy to verify in such optimization problems. In this paper, we study the optimization of the long-run average of continuous-time Markov decision processes with countable state spaces. We provide an intuitive approach to proving the existence of an optimal stationary policy. The approach is based simply on the compactness of the policy space, equipped with a specially designed metric, and the continuity of the long-run average on that space. Our method is capable of handling cost functions that are unbounded both from above and from below, which complements the existing literature, where the cost function is unbounded from only one side. Examples are provided to illustrate the application of our main results. (c) 2023 Elsevier Ltd. All rights reserved.
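To make the quantity in the abstract concrete, the following is a minimal sketch of the long-run average cost under a stationary policy, not the paper's method. It assumes a hypothetical single-server queue truncated to finitely many states, where the policy picks a positive service rate in each nonempty state. All names (`stationary_distribution`, `long_run_average`, `best_stationary_policy`, `holding_cost`, `rate_cost`) and the cost structure are illustrative assumptions. With finitely many states and actions the policy space is finite, so an optimal stationary policy exists trivially and can be found by enumeration; the countable-state case studied in the paper is exactly where that shortcut is unavailable.

```python
from itertools import product


def stationary_distribution(arrival_rate, service_rates):
    # Birth-death chain on states 0..N, with arrivals blocked in state N
    # (an assumed truncation). service_rates[n - 1] is the positive service
    # rate the policy uses in state n, for n = 1..N.
    # Detailed balance: pi[n] * arrival_rate = pi[n + 1] * service_rates[n].
    weights = [1.0]
    for mu in service_rates:
        weights.append(weights[-1] * arrival_rate / mu)
    total = sum(weights)
    return [w / total for w in weights]


def long_run_average(arrival_rate, policy, holding_cost, rate_cost):
    # Average cost = sum over states of (stationary probability * cost rate).
    # Assumed cost rate in state n >= 1: holding_cost * n + rate_cost * rate.
    # State 0 is assumed to incur no cost.
    pi = stationary_distribution(arrival_rate, policy)
    return sum(
        pi[n] * (holding_cost * n + rate_cost * policy[n - 1])
        for n in range(1, len(pi))
    )


def best_stationary_policy(arrival_rate, actions, n_max, holding_cost, rate_cost):
    # Enumerate every stationary deterministic policy on the truncated chain
    # and keep the one with the smallest long-run average cost.
    def average(policy):
        return long_run_average(arrival_rate, policy, holding_cost, rate_cost)

    best = min(product(actions, repeat=n_max), key=average)
    return best, average(best)


# Hypothetical parameters: two available service rates, three nonempty states.
policy, eta = best_stationary_policy(1.0, (1.0, 2.0), 3, 3.0, 1.0)
print("policy:", policy, "average cost:", eta)
```

The enumeration grows exponentially in the number of states, so it only serves to illustrate the objective being minimized; it says nothing about how existence is established when the state space is countably infinite.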

Pages: 8
