Smaller World Models for Reinforcement Learning

Cited by: 0
Authors
Jan Robine
Tobias Uelwer
Stefan Harmeling
Institutions
[1] Technical University of Dortmund, Department of Computer Science
Source
Neural Processing Letters | 2023 / Vol. 55
Keywords
Model-based reinforcement learning; World models; Discrete latent space; VQ-VAE; Atari;
DOI: not available
Abstract
Model-based reinforcement learning algorithms learn an agent by training a model that simulates the environment. However, such models tend to be quite large, which can itself be a burden. In this paper, we address the question of how to design a model with fewer parameters than previous model-based approaches while achieving the same performance in the 100 K-interactions regime. To this end, we create a world model that combines a vector quantized-variational autoencoder (VQ-VAE) to encode observations with a convolutional long short-term memory (ConvLSTM) to model the dynamics. This world model is connected to a model-free proximal policy optimization (PPO) agent that is trained purely on simulated experience from the world model. Detailed experiments on the Atari environments show that it is possible to reach performance comparable to the SimPLe method with a significantly smaller world model. A series of ablation studies justifies our design choices and provides additional insights.
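The core of the VQ-VAE component mentioned in the abstract is the vector-quantization step: each continuous encoder output is replaced by its nearest entry in a learned codebook, yielding a discrete latent code. The sketch below illustrates only that lookup in NumPy; it is an illustrative assumption about the standard VQ-VAE mechanism, not the authors' implementation (codebook size, latent dimension, and names such as `vector_quantize` are hypothetical).

```python
import numpy as np

def vector_quantize(z, codebook):
    """Map each row of z to its nearest codebook entry (squared Euclidean distance).

    z:        (N, D) continuous encoder outputs
    codebook: (K, D) learned discrete codes
    Returns the quantized vectors (N, D) and the code indices (N,).
    """
    # Pairwise squared distances between latents and codes, shape (N, K)
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)          # index of the nearest code per latent
    return codebook[idx], idx

# Tiny demo: latents placed near codes 2 and 5 should snap back to them.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))
z = codebook[[2, 5]] + 0.01 * rng.normal(size=(2, 4))
zq, idx = vector_quantize(z, codebook)
print(idx.tolist())  # [2, 5]
```

In the full world model, the resulting discrete indices would serve as the compact state representation that a dynamics model (here, a ConvLSTM) predicts forward in time.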
Pages: 11397-11427
Page count: 30
Related papers (50 in total)
  • [1] Smaller World Models for Reinforcement Learning
    Robine, Jan
    Uelwer, Tobias
    Harmeling, Stefan
    NEURAL PROCESSING LETTERS, 2023, 55 (08) : 11397 - 11427
  • [2] Deep learning, reinforcement learning, and world models
    Matsuo, Yutaka
    LeCun, Yann
    Sahani, Maneesh
    Precup, Doina
    Silver, David
    Sugiyama, Masashi
    Uchibe, Eiji
    Morimoto, Jun
    NEURAL NETWORKS, 2022, 152 : 267 - 275
  • [3] THE EFFECTIVENESS OF WORLD MODELS FOR CONTINUAL REINFORCEMENT LEARNING
    Kessler, Samuel
    Ostaszewski, Mateusz
    Bortkiewicz, Michal
    Zarski, Mateusz
    Wolczyk, Maciej
    Parker-Holder, Jack
    Roberts, Stephen J.
    Milos, Piotr
    CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 232, 2023, 232 : 184 - 204
  • [4] Building relational world models for reinforcement learning
    Walker, Trevor
    Torrey, Lisa
    Shavlik, Jude
    Maclin, Richard
    INDUCTIVE LOGIC PROGRAMMING, 2008, 4894 : 280 - +
  • [5] Exploring the limits of hierarchical world models in reinforcement learning
    Schiewer, Robin
    Subramoney, Anand
    Wiskott, Laurenz
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [6] Reinforcement learning soccer teams with incomplete world models
    Wiering, Marco
    Sałustowicz, Rafał
    Schmidhuber, Jürgen
    AUTONOMOUS ROBOTS, 1999, 7 (01) : 77 - 88
  • [8] STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning
    Zhang, Weipu
    Wang, Gang
    Sun, Jian
    Yuan, Yetian
    Huang, Gao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [9] Offline model-based reinforcement learning with causal structured world models
    Zhu, Zhengmao
    Tian, Honglong
    Chen, Xionghui
    Zhang, Kun
    Yu, Yang
    FRONTIERS OF COMPUTER SCIENCE, 2025, 19 (04)
  • [10] DreamingV2: Reinforcement Learning with Discrete World Models without Reconstruction
    Okada, Masashi
    Taniguchi, Tadahiro
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 985 - 991