Smaller World Models for Reinforcement Learning

Authors
Jan Robine
Tobias Uelwer
Stefan Harmeling
Affiliation
[1] Technical University of Dortmund, Department of Computer Science
Source
Neural Processing Letters | 2023 / Volume 55
Keywords
Model-based reinforcement learning; World models; Discrete latent space; VQ-VAE; Atari
Abstract
Model-based reinforcement learning algorithms train an agent with the help of a learned model that simulates the environment. However, such models tend to be large, which can itself be a burden. In this paper, we ask how to design a world model with fewer parameters than previous model-based approaches while achieving the same performance in the 100K-interactions regime. To this end, we build a world model that combines a vector quantized variational autoencoder (VQ-VAE), which encodes observations, with a convolutional long short-term memory (LSTM), which models the dynamics. The world model is connected to a model-free proximal policy optimization (PPO) agent that trains purely on simulated experience from the world model. Detailed experiments on the Atari environments show that performance comparable to the SimPLe method can be reached with a significantly smaller world model. A series of ablation studies justifies our design choices and gives additional insights.
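The discrete latent space mentioned in the abstract rests on vector quantization: each continuous latent vector produced by an encoder is replaced by the index of its nearest codebook entry. The following is a minimal sketch of that lookup only, not the authors' implementation. The function names, the 2-dimensional codebook, and the 2x2 latent grid are hypothetical values chosen for illustration.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector`
    (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def quantize(latents, codebook):
    """Replace a grid of continuous latent vectors by a grid of codebook indices."""
    return [[nearest_code(vec, codebook) for vec in row] for row in latents]


# Hypothetical codebook with four 2-dimensional entries.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical 2x2 grid of encoder outputs.
latents = [[[0.1, -0.2], [0.9, 0.2]],
           [[0.2, 0.8], [1.1, 0.7]]]

indices = quantize(latents, codebook)
print(indices)  # [[0, 1], [2, 3]]
```

In a world model of the kind the abstract describes, a dynamics model would operate on such index grids rather than on raw frames. How the indices are fed to the convolutional LSTM is not specified in this record and is left out of the sketch.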
Pages: 11397-11427
Number of pages: 30