Training and Evaluation of Deep Policies Using Reinforcement Learning and Generative Models

Cited by: 0

Authors:

Ghadirzadeh, Ali [1]

Poklukar, Petra [2]

Arndt, Karol [3]

Finn, Chelsea [1]

Kyrki, Ville [3]

Kragic, Danica [2]

Bjorkman, Marten [2]

Affiliations:

[1] Stanford Univ, Stanford, CA 94305 USA

[2] KTH Royal Inst Technol, Stockholm, Sweden

[3] Aalto Univ, Espoo, Finland

Source:

JOURNAL OF MACHINE LEARNING RESEARCH | 2022 / Vol. 23

Keywords:

reinforcement learning; policy search; robot learning; deep generative models; representation learning; PRIMITIVES;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

We present a data-efficient framework for solving sequential decision-making problems which exploits the combination of reinforcement learning (RL) and latent variable generative models. The framework, called GenRL, trains deep policies by introducing an action latent variable such that the feed-forward policy search can be divided into two parts: (i) training a sub-policy that outputs a distribution over the action latent variable given a state of the system, and (ii) unsupervised training of a generative model that outputs a sequence of motor actions conditioned on the latent action variable. GenRL enables safe exploration and alleviates the data-inefficiency problem as it exploits prior knowledge about valid sequences of motor actions. Moreover, we provide a set of measures for evaluation of generative models such that we are able to predict the performance of the RL policy training prior to the actual training on a physical robot. We experimentally determine the characteristics of generative models that have most influence on the performance of the final policy training on two robotics tasks: shooting a hockey puck and throwing a basketball. Furthermore, we empirically demonstrate that GenRL is the only method which can safely and efficiently solve the robotics tasks compared to two state-of-the-art RL methods.
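
The two-part split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the names `sub_policy`, `decode`, and `act`, the Gaussian latent distribution, and the random linear maps standing in for trained deep networks are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
STATE_DIM, LATENT_DIM, HORIZON, ACTION_DIM = 4, 2, 10, 3

# (i) Sub-policy: maps a state to a distribution over the action latent
# variable. A Gaussian with a linear mean is an assumption of this sketch.
W_mu = rng.normal(scale=0.1, size=(LATENT_DIM, STATE_DIM))
log_std = np.full(LATENT_DIM, -1.0)

def sub_policy(state):
    """Return the mean and standard deviation of the latent distribution."""
    return W_mu @ state, np.exp(log_std)

# (ii) Generative model (decoder): maps a latent sample to a whole sequence
# of motor actions. Random weights stand in for a trained model.
W_dec = rng.normal(scale=0.1, size=(HORIZON * ACTION_DIM, LATENT_DIM))

def decode(latent):
    """Return a (HORIZON, ACTION_DIM) action sequence bounded by tanh."""
    return np.tanh(W_dec @ latent).reshape(HORIZON, ACTION_DIM)

def act(state):
    """Sample a latent from the sub-policy, then decode it into actions."""
    mu, std = sub_policy(state)
    latent = mu + std * rng.normal(size=LATENT_DIM)
    return decode(latent)

actions = act(np.ones(STATE_DIM))
```

In this sketch the RL search happens only over the low-dimensional latent, while the decoder alone determines which motor sequences can be produced.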


Pages: 37

50 items in total

[31] Using Generative Adversarial Nets on Atari Games for Feature Extraction in Deep Reinforcement Learning
Aydin, Ayberk
Surer, Elif
2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020,
[32] Learning Deep Generative Models for Queuing Systems
Ojeda, Cesar
Cvejoski, Kostadin
Georgiev, Bogdan
Bauckhage, Christian
Schuecker, Jannis
Sanchez, Ramses J.
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9214 - 9222
[33] Robot-Assisted Training in Laparoscopy Using Deep Reinforcement Learning
Tan, Xiaoyu
Chng, Chin-Boon
Su, Ye
Lim, Kah-Bin
Chui, Chee-Kong
IEEE ROBOTICS AND AUTOMATION LETTERS, 2019, 4 (02) : 485 - 492
[34] Example-guided learning of stochastic human driving policies using deep reinforcement learning
Emuna, Ran
Duffney, Rotem
Borowsky, Avinoam
Biess, Armin
NEURAL COMPUTING & APPLICATIONS, 2023, 35 (23) : 16791 - 16804
[35] LFQ: Online Learning of Per-flow Queuing Policies using Deep Reinforcement Learning
Bachl, Maximilian
Fabini, Joachim
Zseby, Tanja
PROCEEDINGS OF THE 2020 IEEE 45TH CONFERENCE ON LOCAL COMPUTER NETWORKS (LCN 2020), 2020, : 417 - 420
[36] Example-guided learning of stochastic human driving policies using deep reinforcement learning
Ran Emuna
Rotem Duffney
Avinoam Borowsky
Armin Biess
Neural Computing and Applications, 2023, 35 : 16791 - 16804
[37] Reinforcement Learning with Deep Energy-Based Policies
Haarnoja, Tuomas
Tang, Haoran
Abbeel, Pieter
Levine, Sergey
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
[38] Autoregressive Policies for Continuous Control Deep Reinforcement Learning
Korenkevych, Dmytro
Mahmood, A. Rupam
Vasan, Gautham
Bergstra, James
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2754 - 2762
[39] Boosting Deep Reinforcement Learning Agents with Generative Data Augmentation
Papagiannis, Tasos
Alexandridis, Georgios
Stafylopatis, Andreas
APPLIED SCIENCES-BASEL, 2024, 14 (01):
[40] The State of Sparse Training in Deep Reinforcement Learning
Graesser, Laura
Evci, Utku
Elsen, Erich
Castro, Pablo Samuel
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
