Variational Inference MPC for Bayesian Model-based Reinforcement Learning

Cited: 0
Authors:
Okada, Masashi [1]
Taniguchi, Tadahiro [1,2]
Affiliations:
[1] Panasonic Corp, Kadoma, Osaka, Japan
[2] Ritsumeikan Univ, Kyoto, Japan
Keywords:
model predictive control; variational inference; model-based reinforcement learning; PREDICTIVE CONTROL; OPTIMIZATION;
DOI:
Not available
CLC number:
TP39 [Applications of computers];
Subject classification codes:
081203; 0835;
Abstract:
In recent studies on model-based reinforcement learning (MBRL), incorporating uncertainty into the forward dynamics model is a state-of-the-art strategy for enhancing learning performance, making MBRL competitive with cutting-edge model-free methods, especially in simulated robotics tasks. Probabilistic ensembles with trajectory sampling (PETS) is a leading MBRL method that applies Bayesian inference to dynamics modeling and performs model predictive control (MPC) with stochastic optimization via the cross-entropy method (CEM). In this paper, we propose a novel extension of uncertainty-aware MBRL. Our main contributions are twofold. First, we introduce variational inference MPC (VI-MPC), which reformulates various stochastic optimization methods, including CEM, in a Bayesian fashion. Second, we propose a novel instance of this framework, called probabilistic action ensembles with trajectory sampling (PaETS). As a result, our Bayesian MBRL can capture multimodal uncertainty both in dynamics and in optimal trajectories. Compared with PETS, our method consistently improves asymptotic performance on several challenging locomotion tasks.
Pages: 15