Probabilistic inference for determining options in reinforcement learning

Cited by: 0
Authors
Christian Daniel
Herke van Hoof
Jan Peters
Gerhard Neumann
Affiliations
[1] Technische Universität Darmstadt
[2] Bosch Corporate Research
[3] Cognitive Systems
[4] Max-Planck-Institut für Intelligente Systeme
Source
Machine Learning | 2016, Volume 104
Keywords
Reinforcement learning; Robot learning; Options; Semi-Markov decision process
DOI
Not available
Abstract
Tasks that require many sequential decisions or complex solutions are hard to solve using conventional reinforcement learning algorithms. Based on the semi-Markov decision process (SMDP) setting and the option framework, we propose a model which aims to alleviate these concerns. Instead of learning a single monolithic policy, the agent learns a set of simpler sub-policies as well as the initiation and termination probabilities for each of those sub-policies. While existing option learning algorithms frequently require manual specification of components such as the sub-policies, we present an algorithm which infers all relevant components of the option framework from data. Furthermore, the proposed approach is based on parametric option representations and works well in combination with current policy search methods, which are particularly well suited for continuous real-world tasks. We present results on SMDPs with discrete as well as continuous state-action spaces. The results show that the presented algorithm can combine simple sub-policies to solve complex tasks and can improve learning performance on simpler tasks.
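To illustrate the structure described in the abstract, the minimal Python sketch below shows a generic option setup: a gating step that picks a sub-policy via state-dependent initiation probabilities, the sub-policy acting until its termination probability fires, and re-gating at that point (the SMDP decision boundary). All class and parameter names (Option, OptionAgent, W_gate, w_term) are hypothetical illustrations, not the authors' implementation.

```python
# Minimal sketch of an option-based agent (assumed names and model forms,
# not the paper's code): several parametric sub-policies, a softmax gating
# distribution for initiation, and logistic termination probabilities.
import numpy as np

rng = np.random.default_rng(0)

class Option:
    def __init__(self, state_dim, action_dim):
        # Linear-Gaussian sub-policy and a logistic termination model (assumed forms).
        self.K = rng.normal(scale=0.1, size=(action_dim, state_dim))
        self.w_term = rng.normal(scale=0.1, size=state_dim)

    def act(self, s):
        return self.K @ s + rng.normal(scale=0.05, size=self.K.shape[0])

    def terminates(self, s):
        p = 1.0 / (1.0 + np.exp(-self.w_term @ s))  # termination probability
        return rng.random() < p

class OptionAgent:
    def __init__(self, n_options, state_dim, action_dim):
        self.options = [Option(state_dim, action_dim) for _ in range(n_options)]
        # Initiation (gating) parameters: softmax over options given the state.
        self.W_gate = rng.normal(scale=0.1, size=(n_options, state_dim))
        self.active = None

    def select_option(self, s):
        logits = self.W_gate @ s
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return rng.choice(len(self.options), p=probs)

    def act(self, s):
        # Re-gate only when no option is active or the active one terminates.
        if self.active is None or self.options[self.active].terminates(s):
            self.active = self.select_option(s)
        return self.options[self.active].act(s)

# Toy rollout with placeholder dynamics, just to exercise the interfaces.
agent = OptionAgent(n_options=3, state_dim=4, action_dim=2)
s = rng.normal(size=4)
for _ in range(10):
    a = agent.act(s)
    s = s + 0.1 * rng.normal(size=4)
```

In the paper's setting, the gating, termination, and sub-policy parameters would all be inferred from data rather than fixed by hand; the sketch only fixes them randomly to show how the pieces interact.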
Pages: 337-357
Number of pages: 20
Related Papers
50 records in total
  • [1] Probabilistic inference for determining options in reinforcement learning
    Daniel, Christian
    van Hoof, Herke
    Peters, Jan
    Neumann, Gerhard
    MACHINE LEARNING, 2016, 104 (2-3): 337-357
  • [2] Probabilistic Inference in Reinforcement Learning Done Right
    Tarbouriech, Jean
    Lattimore, Tor
    O'Donoghue, Brendan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [3] Understanding Reinforcement Learning Based Localisation as a Probabilistic Inference Algorithm
    Yamagata, Taku
    Santos-Rodriguez, Raul
    Piechocki, Robert
    Flack, Peter
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT II, 2022, 13530: 111-122
  • [4] Tutorial and Survey on Probabilistic Graphical Model and Variational Inference in Deep Reinforcement Learning
    Sun, Xudong
    Bischl, Bernd
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019: 110-119
  • [5] Learning Options in Multiobjective Reinforcement Learning
    Bonini, Rodrigo Cesar
    da Silva, Felipe Leno
    Reali Costa, Anna Helena
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017: 4907-4908
  • [6] Probabilistic learning and inference in schizophrenia
    Averbeck, Bruno B.
    Evans, Simon
    Chouhan, Viraj
    Bristow, Eleanor
    Shergill, Sukhwinder S.
    SCHIZOPHRENIA RESEARCH, 2011, 127 (1-3): 115-122
  • [7] Reinforcement Learning for Options Trading
    Wen, Wen
    Yuan, Yuyu
    Yang, Jincui
    APPLIED SCIENCES-BASEL, 2021, 11 (23)
  • [8] Reinforcement Learning or Active Inference?
    Friston, Karl J.
    Daunizeau, Jean
    Kiebel, Stefan J.
    PLOS ONE, 2009, 4 (07)
  • [9] Probabilistic reinforcement precludes transitive inference: A preliminary study
    Camarena, Hector O.
    Garcia-Leal, Oscar
    Delgadillo-Orozco, Julieta
    Barron, Erick
    FRONTIERS IN PSYCHOLOGY, 2023, 14
  • [10] Transitive inference as probabilistic preference learning
    Mannella, Francesco
    Pezzulo, Giovanni
    PSYCHONOMIC BULLETIN & REVIEW, 2024: 674-689