Smaller World Models for Reinforcement Learning

Cited by: 0
Authors
Jan Robine
Tobias Uelwer
Stefan Harmeling
Affiliations
[1] Technical University of Dortmund, Department of Computer Science
Source
Neural Processing Letters | 2023 / Vol. 55
Keywords
Model-based reinforcement learning; World models; Discrete latent space; VQ-VAE; Atari;
DOI
Not available
Abstract
Model-based reinforcement learning algorithms train an agent with the help of a learned model that simulates the environment. However, such models tend to be large, which can itself be a burden. In this paper, we ask how to design a world model with fewer parameters than previous model-based approaches while achieving the same performance in the 100K-interactions regime. To this end, we build a world model that combines a vector quantized variational autoencoder (VQ-VAE), which encodes observations, with a convolutional long short-term memory (LSTM) network, which models the dynamics. The world model is coupled with a model-free proximal policy optimization (PPO) agent that is trained purely on simulated experience from the world model. Detailed experiments on the Atari environments show that it is possible to reach performance comparable to the SimPLe method with a significantly smaller world model. A series of ablation studies justifies our design choices and provides additional insights.
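As a rough illustration of the discrete latent space mentioned in the abstract, the following is a minimal sketch of the nearest-neighbor quantization step commonly used in VQ-VAEs. It is not the authors' implementation: the function names `squared_distance` and `quantize`, the toy codebook, and the latent values are all hypothetical, and the encoder, decoder, convolutional LSTM, and PPO agent are omitted.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Map each continuous latent vector to its nearest codebook entry.

    Returns the discrete indices and the corresponding codebook vectors.
    """
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        for z in latents
    ]
    quantized = [codebook[k] for k in indices]
    return indices, quantized


# Hypothetical toy values: a codebook of three two-dimensional embeddings
# and three encoder outputs to be discretized.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]
latents = [(0.1, -0.2), (0.9, 1.2), (-0.8, 1.7)]
indices, quantized = quantize(latents, codebook)
```

In this sketch, the list of indices is the discrete representation that a dynamics model could operate on instead of raw pixels; how the paper actually structures and trains its latent space is described in the article itself.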
Pages: 11397-11427
Number of pages: 30