Online Reinforcement Learning by Bayesian Inference

Authors
Xia, Zhongpu [1]
Zhao, Dongbin [1]
Affiliation
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
Keywords
GAUSSIAN-PROCESSES;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Policy evaluation has long been one of the core issues in online reinforcement learning, especially in continuous state domains. In this paper, the issue is addressed by employing Gaussian processes to represent the action value function from a probabilistic perspective. By modeling the return as a random variable, the action value function can be updated sequentially by Bayesian inference from observed variables such as state and reward during policy evaluation. The update rule shows that the method is a temporal difference learning method whose learning rate is determined by the uncertainty of a collected sample. Combining this policy evaluation method with ε-greedy action selection, we propose an online reinforcement learning algorithm referred to as Bayesian-SARSA. It is tested on several benchmark problems, and the empirical results verify its effectiveness.
Pages: 6
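The abstract describes a temporal difference update whose learning rate is set by the uncertainty of the collected sample, combined with ε-greedy action selection. The sketch below illustrates that idea only in a heavily simplified form and is not the authors' Bayesian-SARSA: it assumes a small discrete state space and keeps an independent Gaussian belief (mean and variance) for each state-action pair, whereas the paper uses Gaussian processes over a continuous state domain. The class name `GaussianSarsa`, its parameters (`prior_var`, `noise_var`, and so on), and the treatment of the TD target as a noisy observation with fixed variance are all assumptions made for illustration.

```python
import random


class GaussianSarsa:
    """Tabular SARSA with an independent Gaussian belief per (state, action).

    Hypothetical simplification: the paper uses Gaussian processes; here each
    Q(s, a) has its own scalar mean and variance, with no sharing across states.
    """

    def __init__(self, n_states, n_actions, gamma=0.9, epsilon=0.1,
                 prior_var=1.0, noise_var=0.5, seed=0):
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration probability
        self.noise_var = noise_var    # assumed variance of the TD target
        self.n_actions = n_actions
        self.mean = [[0.0] * n_actions for _ in range(n_states)]
        self.var = [[prior_var] * n_actions for _ in range(n_states)]
        self.rng = random.Random(seed)

    def select_action(self, state):
        # epsilon-greedy on the posterior mean of the action values
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.mean[state]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, s, a, r, s_next, a_next, done):
        # SARSA target: reward plus discounted value of the next chosen action
        target = r if done else r + self.gamma * self.mean[s_next][a_next]
        # Conjugate Gaussian update: the gain plays the role of the learning
        # rate and shrinks as the belief about Q(s, a) becomes more certain.
        gain = self.var[s][a] / (self.var[s][a] + self.noise_var)
        self.mean[s][a] += gain * (target - self.mean[s][a])
        self.var[s][a] *= (1.0 - gain)
        return gain


if __name__ == "__main__":
    agent = GaussianSarsa(n_states=2, n_actions=2)
    for _ in range(3):
        # Repeated terminal transition with reward 1 from state 0, action 0
        gain = agent.update(0, 0, 1.0, 1, 0, done=True)
        print(gain, agent.mean[0][0], agent.var[0][0])
```

In this sketch the gain is large while the belief about a state-action pair is uncertain and decays as more samples arrive, which mirrors the abstract's statement that the learning rate is determined by sample uncertainty. Replacing the independent beliefs with a kernel-based Gaussian process would let one observation inform nearby states, which this version does not attempt.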