Bayesian Learning of Noisy Markov Decision Processes

Cited by: 4
Authors
Singh, Sumeetpal S. [1 ]
Chopin, Nicolas [2 ,3 ]
Whiteley, Nick [4 ]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
[2] CREST ENSAE, Paris, France
[3] HEC Paris, Paris, France
[4] Univ Bristol, Sch Math, Bristol BS8 1TW, Avon, England
Keywords
Data augmentation; parameter expansion; Markov chain Monte Carlo; Markov decision process; Bayesian inference
DOI
10.1145/2414416.2414420
CLC number
TP39 [Applications of Computers]
Discipline codes
081203; 0835
Abstract
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter-expansion step, which is shown to be essential for its good convergence properties. As an illustration, the method is applied to learning a human controller.
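The abstract's key computational ingredient is a parameter-expansion step inside a data-augmentation MCMC sampler. As a rough, self-contained illustration of that general technique (parameter-expanded data augmentation, PX-DA, in the style of Liu & Wu's Haar scheme for a probit model) — and emphatically not the paper's actual sampler for Markov decision processes — one might sketch a toy Gibbs sampler for a probit intercept with a flat prior; all variable names and the toy model are our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary data from a probit model: y_i = 1{z_i > 0}, z_i ~ N(beta_true, 1).
beta_true = 0.5
n = 200
y = (rng.normal(beta_true, 1.0, n) > 0).astype(int)

def draw_truncated_normal(mean, positive, rng):
    """Rejection sampler: draw N(mean, 1) until the sign constraint holds."""
    while True:
        z = rng.normal(mean, 1.0)
        if (z > 0) == positive:
            return z

def px_da_probit(y, n_iter, rng):
    """Gibbs sampler for the probit intercept beta (flat prior) with a
    parameter-expansion (PX-DA) rescaling move between the two DA steps."""
    n = len(y)
    beta = 0.0
    draws = np.empty(n_iter)
    for t in range(n_iter):
        # (1) Data augmentation: impute the latent z_i given y_i and beta.
        z = np.array([draw_truncated_normal(beta, yi == 1, rng) for yi in y])
        # (2) Parameter expansion: rescale z by a group move drawn from the
        #     Haar conditional, g^2 * RSS ~ chi^2_n; this extra step is what
        #     breaks the slow mixing of plain data augmentation.
        rss = np.sum((z - z.mean()) ** 2)
        z = z * np.sqrt(rng.chisquare(n) / rss)
        # (3) Draw beta from its Gaussian full conditional given z.
        beta = rng.normal(z.mean(), 1.0 / np.sqrt(n))
        draws[t] = beta
    return draws

draws = px_da_probit(y, 500, rng)
print(f"posterior mean of beta ~ {draws[100:].mean():.2f}")
```

The rescaling in step (2) is the analogue of the paper's parameter-expansion step: without it, the chain in beta would exhibit the high autocorrelation typical of plain data augmentation.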
Pages: 25
Related papers
50 records
  • [1] Active learning of dynamic Bayesian networks in Markov decision processes
    Jonsson, Anders
    Barto, Andrew
    ABSTRACTION, REFORMULATION, AND APPROXIMATION, PROCEEDINGS, 2007, 4612 : 273+
  • [2] Bayesian Risk Markov Decision Processes
    Lin, Yifan
    Ren, Yuxuan
    Zhou, Enlu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [3] Bayesian Nonparametric Inverse Reinforcement Learning for Switched Markov Decision Processes
    Surana, Amit
    Srivastava, Kunal
    2014 13TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2014, : 47 - 54
  • [4] A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes
    Ross, Stephane
    Pineau, Joelle
    Chaib-draa, Brahim
    Kreitmann, Pierre
    JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 1729 - 1770
  • [5] Incorporating Bayesian Networks in Markov Decision Processes
    Faddoul, R.
    Raphael, W.
    Soubra, A-H
    Chateauneuf, A.
    JOURNAL OF INFRASTRUCTURE SYSTEMS, 2013, 19 (04) : 415 - 424
  • [6] Learning to Collaborate in Markov Decision Processes
    Radanovic, Goran
    Devidze, Rati
    Parkes, David C.
    Singla, Adish
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [7] Learning in Constrained Markov Decision Processes
    Singh, Rahul
    Gupta, Abhishek
    Shroff, Ness B.
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (01): : 441 - 453
  • [8] Blackwell Online Learning for Markov Decision Processes
    Li, Tao
    Peng, Guanze
    Zhu, Quanyan
    2021 55TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2021,
  • [9] Online Learning in Kernelized Markov Decision Processes
    Chowdhury, Sayak Ray
    Gopalan, Aditya
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [10] Learning Factored Markov Decision Processes with Unawareness
    Innes, Craig
    Lascarides, Alex
    35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 123 - 133