Online Reinforcement Learning by Bayesian Inference

Authors
Xia, Zhongpu [1 ]
Zhao, Dongbin [1 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
Keywords
GAUSSIAN-PROCESSES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Policy evaluation has long been one of the core issues of online reinforcement learning, especially in continuous state domains. In this paper, the issue is addressed by employing Gaussian processes to represent the action value function from a probabilistic perspective. By modeling the return as a stochastic variable, the action value function can be updated sequentially by Bayesian inference from observed variables such as state and reward during policy evaluation. The update rule shows that the method is a temporal difference learning method whose learning rate is determined by the uncertainty of a collected sample. Combining this policy evaluation method with ε-greedy action selection, we propose an online reinforcement learning algorithm referred to as Bayesian-SARSA. It is tested on some benchmark problems, and the empirical results verify its effectiveness.
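The abstract's claim that the update is a temporal difference rule with an uncertainty-driven learning rate can be illustrated with a much simpler stand-in. The sketch below is not the paper's Gaussian-process method: it assumes an independent scalar Gaussian belief per action value and a fixed observation noise variance, and the names `bayes_td_update`, `epsilon_greedy`, and `noise_var` are hypothetical. Under those assumptions, the Bayesian posterior mean takes the form of a TD update whose step size is the ratio of prior variance to total variance.

```python
import random


def bayes_td_update(mean, var, target, noise_var):
    """Gaussian posterior update of one action value given a noisy TD target.

    Assumption (not from the paper): prior N(mean, var) and an observation
    `target` corrupted by Gaussian noise with variance `noise_var`.
    """
    gain = var / (var + noise_var)            # plays the role of the learning rate
    new_mean = mean + gain * (target - mean)  # TD-style correction toward the target
    new_var = (1.0 - gain) * var              # uncertainty shrinks after each sample
    return new_mean, new_var, gain


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# One SARSA-style step on made-up numbers (all values are illustrative).
gamma = 0.9                        # discount factor
reward = 1.0                       # observed reward
q_next = 2.0                       # current estimate for the next state-action pair
target = reward + gamma * q_next   # SARSA target
mean, var, gain = bayes_td_update(mean=0.0, var=1.0, target=target, noise_var=1.0)
action = epsilon_greedy([mean, 0.5], epsilon=0.1, rng=random.Random(0))
```

With equal prior and noise variances the gain is 0.5, and repeating the update on the same value lowers the gain each time, so uncertain estimates move quickly while well-observed ones move slowly.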
Pages: 6