Bayesian Learning of Noisy Markov Decision Processes

Cited by: 4
Authors
Singh, Sumeetpal S. [1 ]
Chopin, Nicolas [2 ,3 ]
Whiteley, Nick [4 ]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
[2] CREST ENSAE, Paris, France
[3] HEC Paris, Paris, France
[4] Univ Bristol, Sch Math, Bristol BS8 1TW, Avon, England
Keywords
Data augmentation; parameter expansion; Markov chain Monte Carlo; Markov decision process; Bayesian inference
DOI
10.1145/2414416.2414420
CLC number
TP39 [Applications of Computers]
Discipline codes
081203; 0835
Abstract
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter-expansion step, which is shown to be essential for its good convergence properties. As an illustration, the method is applied to learning a human controller.
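The abstract's key computational ingredient is a parameter-expansion step inside a data-augmentation MCMC sampler. As a rough, self-contained illustration of that general technique (parameter-expanded data augmentation, PX-DA, in the style of Liu & Wu's Haar scheme for a probit model) — and emphatically not the paper's actual sampler for Markov decision processes — one might sketch a toy Gibbs sampler for a probit intercept with a flat prior; all variable names and the toy model are our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary data from a probit model: y_i = 1{z_i > 0}, z_i ~ N(beta_true, 1).
beta_true = 0.5
n = 200
y = (rng.normal(beta_true, 1.0, n) > 0).astype(int)

def draw_truncated_normal(mean, positive, rng):
    """Rejection sampler: draw N(mean, 1) until the sign constraint holds."""
    while True:
        z = rng.normal(mean, 1.0)
        if (z > 0) == positive:
            return z

def px_da_probit(y, n_iter, rng):
    """Gibbs sampler for the probit intercept beta (flat prior) with a
    parameter-expansion (PX-DA) rescaling move between the two DA steps."""
    n = len(y)
    beta = 0.0
    draws = np.empty(n_iter)
    for t in range(n_iter):
        # (1) Data augmentation: impute the latent z_i given y_i and beta.
        z = np.array([draw_truncated_normal(beta, yi == 1, rng) for yi in y])
        # (2) Parameter expansion: rescale z by a group move drawn from the
        #     Haar conditional, g^2 * RSS ~ chi^2_n; this extra step is what
        #     breaks the slow mixing of plain data augmentation.
        rss = np.sum((z - z.mean()) ** 2)
        z = z * np.sqrt(rng.chisquare(n) / rss)
        # (3) Draw beta from its Gaussian full conditional given z.
        beta = rng.normal(z.mean(), 1.0 / np.sqrt(n))
        draws[t] = beta
    return draws

draws = px_da_probit(y, 500, rng)
print(f"posterior mean of beta ~ {draws[100:].mean():.2f}")
```

The rescaling in step (2) is the analogue of the paper's parameter-expansion step: without it, the chain in beta would exhibit the high autocorrelation typical of plain data augmentation.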
Pages: 25
Related papers
50 records
  • [1] Active learning of dynamic Bayesian networks in Markov decision processes
    Jonsson, Anders
    Barto, Andrew
    ABSTRACTION, REFORMULATION, AND APPROXIMATION, PROCEEDINGS, 2007, 4612 : 273+
  • [2] Bayesian Risk Markov Decision Processes
    Lin, Yifan
    Ren, Yuxuan
    Zhou, Enlu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [3] Bayesian Nonparametric Inverse Reinforcement Learning for Switched Markov Decision Processes
    Surana, Amit
    Srivastava, Kunal
    2014 13TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2014, : 47 - 54
  • [4] A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes
    Ross, Stephane
    Pineau, Joelle
    Chaib-draa, Brahim
    Kreitmann, Pierre
    JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 1729 - 1770
  • [5] Incorporating Bayesian Networks in Markov Decision Processes
    Faddoul, R.
    Raphael, W.
    Soubra, A-H
    Chateauneuf, A.
    JOURNAL OF INFRASTRUCTURE SYSTEMS, 2013, 19 (04) : 415 - 424
  • [6] Learning to Collaborate in Markov Decision Processes
    Radanovic, Goran
    Devidze, Rati
    Parkes, David C.
    Singla, Adish
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [7] Learning in Constrained Markov Decision Processes
    Singh, Rahul
    Gupta, Abhishek
    Shroff, Ness B.
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (01): : 441 - 453
  • [8] Blackwell Online Learning for Markov Decision Processes
    Li, Tao
    Peng, Guanze
    Zhu, Quanyan
    2021 55TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2021,
  • [9] Online Learning in Kernelized Markov Decision Processes
    Chowdhury, Sayak Ray
    Gopalan, Aditya
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [10] Learning Factored Markov Decision Processes with Unawareness
    Innes, Craig
    Lascarides, Alex
    35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 123 - 133