Bayesian Risk Markov Decision Processes

Citations: 0
Authors
Lin, Yifan [1 ]
Ren, Yuxuan [1 ]
Zhou, Enlu [1 ]
Affiliations
[1] Georgia Inst Technol, Ind & Syst Engn, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA)
Keywords
ROBUST; APPROXIMATIONS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We consider finite-horizon Markov decision processes (MDPs) where parameters, such as transition probabilities, are unknown and estimated from data. The popular distributionally robust approach to addressing the parameter uncertainty can sometimes be overly conservative. In this paper, we propose a new formulation, the Bayesian risk Markov decision process (BR-MDP), to address parameter uncertainty in MDPs, where a risk functional is applied in nested form to the expected total cost with respect to the Bayesian posterior distributions of the unknown parameters. The proposed formulation provides more flexible risk attitudes towards parameter uncertainty and takes into account the availability of data in future time stages. To solve the proposed formulation with the conditional value-at-risk (CVaR) risk functional, we propose an efficient approximation algorithm by deriving an analytical approximation of the value function and utilizing the convexity of CVaR. We demonstrate the empirical performance of the BR-MDP formulation and proposed algorithms on a gambler's betting problem and an inventory control problem.
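Illustrative sketch (not taken from the paper): the abstract applies a CVaR risk functional to an expected cost, with the risk taken over the Bayesian posterior of an unknown parameter. The Python below shows only a single-stage version of that idea, not the nested multi-stage BR-MDP formulation or the authors' approximation algorithm. The function names, the uniform Beta prior, the even-money bet, and the sample-based CVaR estimator are all assumptions made for illustration.

```python
import numpy as np


def empirical_cvar(costs, alpha):
    """Sample estimate of CVaR_alpha for costs (larger is worse).

    Uses the Rockafellar-Uryasev form  u + E[(X - u)^+] / (1 - alpha),
    evaluated at u = empirical alpha-quantile (the VaR).
    """
    costs = np.sort(np.asarray(costs, dtype=float))
    n = costs.size
    # Index of the smallest sample whose empirical CDF reaches alpha.
    k = max(int(np.ceil(alpha * n)) - 1, 0)
    var = costs[k]
    excess = np.maximum(costs - var, 0.0)
    return var + excess.mean() / (1.0 - alpha)


def posterior_risk_of_bet(wins, losses, stake, alpha, n_samples=10_000, seed=0):
    """CVaR, over the posterior of the win probability, of the expected cost
    of one even-money bet (hypothetical one-stage example)."""
    rng = np.random.default_rng(seed)
    # Assumption: uniform Beta(1, 1) prior, so the posterior is Beta(1+wins, 1+losses).
    theta = rng.beta(1 + wins, 1 + losses, size=n_samples)
    # Expected cost given theta: lose `stake` w.p. (1 - theta), gain it w.p. theta.
    expected_cost = stake * (1.0 - 2.0 * theta)
    return empirical_cvar(expected_cost, alpha)


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 0.9):
        print(alpha, posterior_risk_of_bet(wins=6, losses=4, stake=1.0, alpha=alpha))
```

With alpha = 0 the estimator reduces to the posterior mean of the expected cost (a risk-neutral attitude), and raising alpha weights the worst posterior scenarios more heavily, which is one way to read the "more flexible risk attitudes" mentioned in the abstract.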
Pages: 13