Probabilistic inference for determining options in reinforcement learning

Cited by: 0
Authors
Christian Daniel
Herke van Hoof
Jan Peters
Gerhard Neumann
Affiliations
[1] Technische Universität Darmstadt
[2] Bosch Corporate Research
[3] Cognitive Systems
[4] Max-Planck-Institut für Intelligente Systeme
Source
Machine Learning | 2016, Volume 104
Keywords
Reinforcement learning; Robot learning; Options; Semi-Markov decision process
DOI
Not available
Abstract
Tasks that require many sequential decisions or complex solutions are hard to solve using conventional reinforcement learning algorithms. Based on the semi-Markov decision process (SMDP) setting and the option framework, we propose a model which aims to alleviate these concerns. Instead of learning a single monolithic policy, the agent learns a set of simpler sub-policies as well as the initiation and termination probabilities for each of those sub-policies. While existing option learning algorithms frequently require manual specification of components such as the sub-policies, we present an algorithm which infers all relevant components of the option framework from data. Furthermore, the proposed approach is based on parametric option representations and works well in combination with current policy search methods, which are particularly well suited for continuous real-world tasks. We present results on SMDPs with discrete as well as continuous state-action spaces. The results show that the presented algorithm can combine simple sub-policies to solve complex tasks and can improve learning performance on simpler tasks.
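To illustrate the structure described in the abstract, the minimal Python sketch below shows a generic option setup: a gating step that picks a sub-policy via state-dependent initiation probabilities, the sub-policy acting until its termination probability fires, and re-gating at that point (the SMDP decision boundary). All class and parameter names (Option, OptionAgent, W_gate, w_term) are hypothetical illustrations, not the authors' implementation.

```python
# Minimal sketch of an option-based agent (assumed names and model forms,
# not the paper's code): several parametric sub-policies, a softmax gating
# distribution for initiation, and logistic termination probabilities.
import numpy as np

rng = np.random.default_rng(0)

class Option:
    def __init__(self, state_dim, action_dim):
        # Linear-Gaussian sub-policy and a logistic termination model (assumed forms).
        self.K = rng.normal(scale=0.1, size=(action_dim, state_dim))
        self.w_term = rng.normal(scale=0.1, size=state_dim)

    def act(self, s):
        return self.K @ s + rng.normal(scale=0.05, size=self.K.shape[0])

    def terminates(self, s):
        p = 1.0 / (1.0 + np.exp(-self.w_term @ s))  # termination probability
        return rng.random() < p

class OptionAgent:
    def __init__(self, n_options, state_dim, action_dim):
        self.options = [Option(state_dim, action_dim) for _ in range(n_options)]
        # Initiation (gating) parameters: softmax over options given the state.
        self.W_gate = rng.normal(scale=0.1, size=(n_options, state_dim))
        self.active = None

    def select_option(self, s):
        logits = self.W_gate @ s
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return rng.choice(len(self.options), p=probs)

    def act(self, s):
        # Re-gate only when no option is active or the active one terminates.
        if self.active is None or self.options[self.active].terminates(s):
            self.active = self.select_option(s)
        return self.options[self.active].act(s)

# Toy rollout with placeholder dynamics, just to exercise the interfaces.
agent = OptionAgent(n_options=3, state_dim=4, action_dim=2)
s = rng.normal(size=4)
for _ in range(10):
    a = agent.act(s)
    s = s + 0.1 * rng.normal(size=4)
```

In the paper's setting, the gating, termination, and sub-policy parameters would all be inferred from data rather than fixed by hand; the sketch only fixes them randomly to show how the pieces interact.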
Pages: 337-357
Number of pages: 20
Related Papers
50 records in total
  • [1] Probabilistic inference for determining options in reinforcement learning
    Daniel, Christian
    van Hoof, Herke
    Peters, Jan
    Neumann, Gerhard
    MACHINE LEARNING, 2016, 104 (2-3): 337-357
  • [2] Probabilistic Inference in Reinforcement Learning Done Right
    Tarbouriech, Jean
    Lattimore, Tor
    O'Donoghue, Brendan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [3] Understanding Reinforcement Learning Based Localisation as a Probabilistic Inference Algorithm
    Yamagata, Taku
    Santos-Rodriguez, Raul
    Piechocki, Robert
    Flack, Peter
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT II, 2022, 13530: 111-122
  • [4] Tutorial and Survey on Probabilistic Graphical Model and Variational Inference in Deep Reinforcement Learning
    Sun, Xudong
    Bischl, Bernd
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019: 110-119
  • [5] Learning Options in Multiobjective Reinforcement Learning
    Bonini, Rodrigo Cesar
    da Silva, Felipe Leno
    Reali Costa, Anna Helena
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017: 4907-4908
  • [6] Probabilistic learning and inference in schizophrenia
    Averbeck, Bruno B.
    Evans, Simon
    Chouhan, Viraj
    Bristow, Eleanor
    Shergill, Sukhwinder S.
    SCHIZOPHRENIA RESEARCH, 2011, 127 (1-3): 115-122
  • [7] Reinforcement Learning for Options Trading
    Wen, Wen
    Yuan, Yuyu
    Yang, Jincui
    APPLIED SCIENCES-BASEL, 2021, 11 (23)
  • [8] Reinforcement Learning or Active Inference?
    Friston, Karl J.
    Daunizeau, Jean
    Kiebel, Stefan J.
    PLOS ONE, 2009, 4 (07)
  • [9] Probabilistic reinforcement precludes transitive inference: A preliminary study
    Camarena, Hector O.
    Garcia-Leal, Oscar
    Delgadillo-Orozco, Julieta
    Barron, Erick
    FRONTIERS IN PSYCHOLOGY, 2023, 14
  • [10] Transitive inference as probabilistic preference learning
    Mannella, Francesco
    Pezzulo, Giovanni
    PSYCHONOMIC BULLETIN & REVIEW, 2024: 674-689