Bayesian Risk Markov Decision Processes

Cited by: 0
Authors
Lin, Yifan [1]
Ren, Yuxuan [1]
Zhou, Enlu [1]
Affiliations
[1] Georgia Institute of Technology, Industrial & Systems Engineering, Atlanta, GA 30332 USA
Funding
US National Science Foundation
Keywords
ROBUST; APPROXIMATIONS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We consider finite-horizon Markov decision processes (MDPs) where parameters, such as transition probabilities, are unknown and estimated from data. The popular distributionally robust approach to addressing the parameter uncertainty can sometimes be overly conservative. In this paper, we propose a new formulation, the Bayesian risk Markov decision process (BR-MDP), to address parameter uncertainty in MDPs, where a risk functional is applied in nested form to the expected total cost with respect to the Bayesian posterior distributions of the unknown parameters. The proposed formulation provides more flexible risk attitudes towards parameter uncertainty and takes into account the availability of data in future time stages. To solve the proposed formulation with the conditional value-at-risk (CVaR) risk functional, we propose an efficient approximation algorithm by deriving an analytical approximation of the value function and utilizing the convexity of CVaR. We demonstrate the empirical performance of the BR-MDP formulation and the proposed algorithms on a gambler's betting problem and an inventory control problem.
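As a rough illustration of the CVaR risk functional named in the abstract, the sketch below estimates the CVaR of a sampled cost using the Rockafellar-Uryasev minimization form, whose objective is convex in the auxiliary variable u. It is a single-stage example under stated assumptions, not the paper's nested formulation or its approximation algorithm. The Beta(3, 2) posterior, the unit-bet cost, and the names `empirical_cvar`, `theta`, and `expected_cost` are hypothetical.

```python
import numpy as np

def empirical_cvar(costs, alpha):
    """Empirical CVaR at level alpha of sampled costs (larger cost = worse).

    Uses CVaR_alpha(X) = min_u { u + E[(X - u)^+] / (1 - alpha) }.
    The objective is convex and piecewise linear in u, so a minimizer
    can be found among the sample points themselves.
    """
    costs = np.asarray(costs, dtype=float)
    # excess[i, j] = max(costs[j] - costs[i], 0), with u = costs[i]
    excess = np.maximum(costs[None, :] - costs[:, None], 0.0)
    objective = costs + excess.mean(axis=1) / (1.0 - alpha)
    return float(objective.min())

# Hypothetical setting: a gambler's win probability theta is unknown and
# carries a Beta(3, 2) posterior (an assumption made for illustration only).
rng = np.random.default_rng(0)
theta = rng.beta(3.0, 2.0, size=1000)

# Expected loss of a unit bet given theta: lose 1 w.p. (1 - theta), win 1 w.p. theta.
expected_cost = 1.0 - 2.0 * theta

risk_neutral = float(expected_cost.mean())               # plain posterior mean
risk_averse = empirical_cvar(expected_cost, alpha=0.9)   # mean of the worst 10% tail
```

Setting `alpha=0` recovers the posterior mean, and raising `alpha` toward 1 moves the value toward the worst sampled cost, which is one way to read the "flexible risk attitudes" the abstract mentions.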
Pages: 13