Online Reinforcement Learning by Bayesian Inference

Cited by: 0

Authors:

Xia, Zhongpu [1]

Zhao, Dongbin [1]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

Source:

2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2015

Keywords:

GAUSSIAN-PROCESSES;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Policy evaluation has long been one of the core issues in online reinforcement learning, especially in continuous state domains. In this paper, the issue is addressed by employing Gaussian processes to represent the action value function from a probabilistic perspective. By modeling the return as a random variable, the action value function can be updated sequentially by Bayesian inference during policy evaluation, using observed variables such as state and reward. The update rule shows that it is a temporal difference learning method with the learning rate determined by the uncertainty of a collected sample. Combining this policy evaluation method with the ε-greedy action selection method, we propose an online reinforcement learning algorithm referred to as Bayesian-SARSA. It is tested on some benchmark problems and the empirical results verify its effectiveness.
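
The claim that the update is a temporal difference rule with an uncertainty-driven learning rate can be illustrated with a minimal sketch. This is not the Bayesian-SARSA algorithm of the paper, which uses Gaussian processes over continuous states. It assumes instead a single action value with a scalar Gaussian belief, a fixed observation noise variance, and a bootstrapped target whose own uncertainty is ignored. The function name `bayesian_td_update` and all parameter values are hypothetical.

```python
def bayesian_td_update(mean, var, reward, next_mean, gamma=0.95, noise_var=1.0):
    """One Gaussian posterior update of a single action value (illustrative only).

    mean, var  : current belief about Q(s, a), assumed Gaussian
    next_mean  : current mean estimate of Q(s', a'), treated as fixed (assumption)
    noise_var  : assumed variance of the noise on the observed target
    """
    # SARSA-style bootstrapped target built from the observed reward.
    target = reward + gamma * next_mean
    # The gain plays the role of the learning rate: large when the belief is
    # uncertain (var >> noise_var), small once the belief is confident.
    gain = var / (var + noise_var)
    # Posterior mean has the temporal-difference form: old + rate * TD error.
    new_mean = mean + gain * (target - mean)
    # Posterior variance shrinks after each observation.
    new_var = (1.0 - gain) * var
    return new_mean, new_var, gain


# Repeatedly observe the same transition and record the learning rates.
mean, var = 0.0, 4.0
gains = []
for _ in range(3):
    mean, var, gain = bayesian_td_update(mean, var, reward=1.0, next_mean=0.0)
    gains.append(gain)
    print(f"mean={mean:.3f} var={var:.3f} learning_rate={gain:.3f}")
```

In this sketch the learning rate falls with every sample because the variance shrinks, which mirrors the behavior described in the abstract without reproducing the paper's Gaussian process machinery.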

Pages: 6

50 items in total

[31] Kernel-based direct policy search reinforcement learning based on variational Bayesian inference
Yamaguchi, Nobuhiko
Fukuda, Osamu
Okumura, Hiroshi
2019 SEVENTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING WORKSHOPS (CANDARW 2019), 2019, : 184 - 187
[32] Reinforcement learning-trained optimisers and Bayesian optimisation for online particle accelerator tuning
Kaiser, Jan
Xu, Chenran
Eichler, Annika
Garcia, Andrea Santamaria
Stein, Oliver
Bruendermann, Erik
Kuropka, Willi
Dinter, Hannes
Mayet, Frank
Vinatier, Thomas
Burkart, Florian
Schlarb, Holger
SCIENTIFIC REPORTS, 2024, 14 (01):
[33] Gene Networks Inference by Reinforcement Learning
Bonini, Rodrigo Cesar
Martins, David Correa, Jr.
ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, BSB 2023, 2023, 13954 : 136 - 147
[34] Reinforcement Learning Based Online Request Scheduling Framework for Workload-Adaptive Edge Deep Learning Inference
Tan, Xinrui
Li, Hongjia
Xie, Xiaofei
Guo, Lu
Ansari, Nirwan
Huang, Xueqing
Wang, Liming
Xu, Zhen
Liu, Yang
IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (12) : 13222 - 13239
[35] Inference and learning in fuzzy Bayesian networks
Baldwin, JF
Di Tomaso, E
PROCEEDINGS OF THE 12TH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1 AND 2, 2003, : 630 - 635
[36] On Sequential Bayesian Inference for Continual Learning
Kessler, Samuel
Cobb, Adam
Rudner, Tim G. J.
Zohren, Stefan
Roberts, Stephen J.
ENTROPY, 2023, 25 (06)
[37] Subspace Inference for Bayesian Deep Learning
Izmailov, Pavel
Maddox, Wesley J.
Kirichenko, Polina
Garipov, Timur
Vetrov, Dmitry
Wilson, Andrew Gordon
35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 1169 - 1179
[38] Collapsed Inference for Bayesian Deep Learning
Zeng, Zhe
Van den Broeck, Guy
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
[39] Online testing with reinforcement learning
Veanes, Margus
Roy, Pritam
Campbell, Colin
FORMAL APPROACHES TO SOFTWARE TESTING AND RUNTIME VERIFICATION, 2006, 4262 : 240 - +
[40] Online shielding for reinforcement learning
Koenighofer, Bettina
Rudolf, Julian
Palmisano, Alexander
Tappler, Martin
Bloem, Roderick
INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING, 2023, 19 (04) : 379 - 394
