Online Reinforcement Learning by Bayesian Inference

Cited by: 0

Authors:

Xia, Zhongpu [1]

Zhao, Dongbin [1]

Affiliations:

[1] Chinese Academy of Sciences, Institute of Automation, State Key Laboratory of Management and Control for Complex Systems, Beijing 100190, People's Republic of China

Source:

2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2015

Keywords:

GAUSSIAN-PROCESSES;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Policy evaluation has long been one of the core issues in online reinforcement learning, especially in continuous state domains. In this paper, the issue is addressed by employing Gaussian processes to represent the action-value function from a probabilistic perspective. By modeling the return as a random variable, the action-value function can be updated sequentially by Bayesian inference from observed variables such as state and reward during policy evaluation. The update rule shows that the method is a temporal-difference learning method whose learning rate is determined by the uncertainty of a collected sample. Combining this policy evaluation method with ε-greedy action selection, we propose an online reinforcement learning algorithm referred to as Bayesian-SARSA. It is tested on several benchmark problems, and the empirical results verify its effectiveness.
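
The abstract does not give the update rule itself, so the following is only a minimal sketch of the general idea, not the paper's Bayesian-SARSA. It assumes a linear-Gaussian stand-in for the Gaussian process: a Gaussian belief over the weights of a linear action-value model, updated by a Kalman-style Bayesian step in which the gain plays the role of an uncertainty-dependent learning rate. The names `BayesianTD` and `epsilon_greedy`, the feature vectors, and all parameter values are hypothetical.

```python
import numpy as np


class BayesianTD:
    """Gaussian belief over weights w of a linear model Q(s, a) = phi(s, a) @ w.

    Illustrative assumption: a parametric stand-in for a Gaussian process.
    """

    def __init__(self, n_features, prior_var=1.0, noise_var=0.1, gamma=0.9):
        self.mean = np.zeros(n_features)           # posterior mean of w
        self.cov = prior_var * np.eye(n_features)  # posterior covariance of w
        self.noise_var = noise_var                 # assumed observation noise
        self.gamma = gamma                         # discount factor

    def q(self, phi):
        """Posterior mean of the action value for feature vector phi."""
        return float(phi @ self.mean)

    def update(self, phi, reward, phi_next):
        """One SARSA-style Bayesian update; returns the gain vector."""
        # Assumed observation model: reward = (phi - gamma * phi_next) @ w + noise
        h = phi - self.gamma * phi_next
        # Equals r + gamma * Q(s', a') - Q(s, a), the usual TD error
        td_error = reward - h @ self.mean
        # Predictive variance of the reward: large when the belief is uncertain
        s = h @ self.cov @ h + self.noise_var
        # Gain acts as a per-sample learning rate set by the uncertainty
        gain = self.cov @ h / s
        self.mean = self.mean + gain * td_error
        self.cov = self.cov - np.outer(gain, h @ self.cov)
        return gain


def epsilon_greedy(model, features, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one.

    `features` is a list with one feature vector per candidate action.
    """
    if rng.random() < epsilon:
        return int(rng.integers(len(features)))
    return int(np.argmax([model.q(phi) for phi in features]))
```

In this sketch, repeating an update on the same feature vector shrinks the posterior covariance, so the gain, and with it the effective learning rate, decreases as the sample becomes less uncertain.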

Pages: 6

