Sample-path optimality and variance-maximization for Markov decision processes

Cited by: 3
Authors
Zhu, Q. X. [1]
Affiliations
[1] S China Normal Univ, Dept Math, Guangzhou 510631, Peoples R China
Keywords
discrete-time Markov decision process; unbounded reward; sample-path reward criterion; variance-maximization; optimal stationary policy;
DOI
10.1007/s00186-006-0126-9
Chinese Library Classification
C93 [Management Science]; O22 [Operations Research];
Discipline classification codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
This paper studies both the average sample-path reward (ASPR) criterion and the limiting average variance criterion for denumerable discrete-time Markov decision processes. The rewards may have neither upper nor lower bounds. We give sufficient conditions on the system's primitive data under which we prove the existence of ASPR-optimal stationary policies and variance-optimal policies. Our conditions are weaker than those in the previous literature. Moreover, our results are illustrated by a controlled queueing system.
Pages: 519 - 538
Page count: 20
Related papers
50 in total
  • [1] Sample-path optimality and variance-maximization for Markov decision processes
    Q. X. Zhu
    Mathematical Methods of Operations Research, 2007, 65 : 519 - 538
  • [2] Sample-path average optimality for Markov control processes
    Lasserre, JB
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1999, 44 (10) : 1966 - 1971
  • [3] Sample-path optimality and variance-minimization of average cost Markov control processes
    Hernández-Lerma, O
    Vega-Amaya, O
    Carrasco, G
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1999, 38 (01) : 79 - 93
  • [4] Sample-path optimality and variance-minimization of average cost Markov control processes
    Hernández-Lerma, Onésimo
    Vega-Amaya, Oscar
    Carrasco, Guadalupe
    SIAM Journal on Control and Optimization, 38 (01) : 79 - 93
  • [5] Average sample-path optimality for continuous-time Markov decision processes in Polish spaces
    Zhu, Quan-xin
    ACTA MATHEMATICAE APPLICATAE SINICA-ENGLISH SERIES, 2011, 27 (04) : 613 - 624
  • [6] Average sample-path optimality for continuous-time Markov decision processes in Polish spaces
    Quan-xin Zhu
    Acta Mathematicae Applicatae Sinica, English Series, 2011, 27 : 613 - 624
  • [7] A Sensitivity-Based Construction Approach to Sample-Path Variance Minimization of Markov Decision Processes
    Huang, Yonghao
    Chen, Xi
    2012 2ND AUSTRALIAN CONTROL CONFERENCE (AUCC), 2012 : 215 - 220
  • [8] A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion
    Rolando Cavazos-Cadena
    Raúl Montes-de-Oca
    Karel Sladký
    Journal of Optimization Theory and Applications, 2014, 163 : 674 - 684
  • [9] A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion
    Cavazos-Cadena, Rolando
    Montes-de-Oca, Raul
    Sladky, Karel
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2014, 163 (02) : 674 - 684
  • [10] Sample-path and variance minimization of Markov control processes with average cost criteria
    Hernández-Lerma, O
    Vega-Amaya, O
    Carrasco, G
    PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 2000 : 1172 - 1176