Scaling, Control and Generalization in Reinforcement Learning Level Generators

Cited: 0
Authors:
Earle, Sam [1 ]
Jiang, Zehua [1 ]
Togelius, Julian [1 ]
Affiliations:
[1] NYU, Game Innovat Lab, Brooklyn, NY 11201 USA
Keywords:
procedural content generation; reinforcement learning
DOI: 10.1109/CoG60054.2024.10645598
CLC number: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
Procedural Content Generation via Reinforcement Learning (PCGRL) has been introduced as a means by which controllable designer agents can be trained based only on a set of computable metrics acting as a proxy for the level's quality and key characteristics. While PCGRL offers a unique set of affordances for game designers, it is constrained by the compute-intensive process of training RL agents, and has so far been limited to generating relatively small levels. To address this issue of scale, we implement several PCGRL environments in Jax so that all aspects of learning and simulation happen in parallel on the GPU, resulting in faster environment simulation, removing the CPU-GPU information-transfer bottleneck during RL training, and ultimately yielding significantly improved training speed. We replicate several key results from prior works in this new framework, letting models train for much longer than previously studied and evaluating their behavior after 1 billion timesteps. Aiming for greater control for human designers, we introduce randomized level sizes and frozen "pinpoints" of pivotal game tiles as further ways of countering overfitting. To test the generalization ability of learned generators, we evaluate models on large, out-of-distribution map sizes, and find that models with partial observation sizes learn more robust design strategies.
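The abstract's core speed claim rests on keeping both simulation and learning on the accelerator: if the environment step is written as a pure Jax function, it can be batched with `jax.vmap` and compiled with `jax.jit`, so thousands of level-editing environments advance in one device call with no per-step CPU-GPU transfer. The sketch below illustrates that pattern only; it is not the authors' code, and the tile set, map size, and toy reward (change in non-empty tile count, standing in for PCGRL's computable quality metrics) are all assumptions for illustration.

```python
import jax
import jax.numpy as jnp

MAP_SIZE = 8      # illustrative level side length (assumed)
N_TILE_TYPES = 3  # e.g. empty, wall, goal (assumed tile set)

def step(level, cursor, action):
    """One edit step for a single environment: write tile type `action`
    at `cursor`, advance the cursor in row-major order, and return a toy
    reward equal to the change in the count of non-empty tiles."""
    before = jnp.sum(level != 0)
    level = level.at[cursor[0], cursor[1]].set(action)
    after = jnp.sum(level != 0)
    # Advance the cursor one cell, wrapping at the end of the grid.
    flat = (cursor[0] * MAP_SIZE + cursor[1] + 1) % (MAP_SIZE * MAP_SIZE)
    new_cursor = jnp.array([flat // MAP_SIZE, flat % MAP_SIZE])
    reward = (after - before).astype(jnp.float32)
    return level, new_cursor, reward

# vmap batches the step across many parallel environments; jit compiles
# the batched step into a single device kernel launch, so the whole
# rollout stays on the GPU with no Python-side per-environment loop.
batched_step = jax.jit(jax.vmap(step))

if __name__ == "__main__":
    n_envs = 1024
    levels = jnp.zeros((n_envs, MAP_SIZE, MAP_SIZE), dtype=jnp.int32)
    cursors = jnp.zeros((n_envs, 2), dtype=jnp.int32)
    actions = jnp.ones((n_envs,), dtype=jnp.int32)  # place tile type 1
    levels, cursors, rewards = batched_step(levels, cursors, actions)
    print(levels.shape)  # (1024, 8, 8)
```

In a full PCGRL training loop, the policy's forward pass and the environment step would both live inside the same jitted function, which is what removes the transfer bottleneck the abstract describes.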
Pages: 8