Training and Evaluation of Deep Policies Using Reinforcement Learning and Generative Models

Cited by: 0
Authors
Ghadirzadeh, Ali [1]
Poklukar, Petra [2]
Arndt, Karol [3]
Finn, Chelsea [1]
Kyrki, Ville [3]
Kragic, Danica [2]
Bjorkman, Marten [2]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] KTH Royal Inst Technol, Stockholm, Sweden
[3] Aalto Univ, Espoo, Finland
Keywords
reinforcement learning; policy search; robot learning; deep generative models; representation learning; primitives
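DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract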
We present a data-efficient framework for solving sequential decision-making problems that combines reinforcement learning (RL) with latent variable generative models. The framework, called GenRL, trains deep policies by introducing an action latent variable such that the feed-forward policy search can be divided into two parts: (i) training a sub-policy that outputs a distribution over the action latent variable given a state of the system, and (ii) unsupervised training of a generative model that outputs a sequence of motor actions conditioned on the latent action variable. GenRL enables safe exploration and alleviates the data-inefficiency problem because it exploits prior knowledge about valid sequences of motor actions. Moreover, we provide a set of measures for evaluating generative models, so that we can predict the performance of the RL policy training before the actual training on a physical robot. We experimentally determine the characteristics of generative models that have the most influence on the performance of the final policy training on two robotics tasks: shooting a hockey puck and throwing a basketball. Furthermore, we empirically demonstrate that, compared with two state-of-the-art RL methods, GenRL is the only method that can safely and efficiently solve the robotics tasks.
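The two-part split described in the abstract can be illustrated with a minimal sketch. This is not the GenRL implementation: the linear-Gaussian `sub_policy`, the fixed noise level, and the tanh-based `decoder` are hypothetical stand-ins chosen only to show the data flow from state, to action latent variable, to a sequence of motor actions. In the paper the generative model is trained in an unsupervised way; here it is replaced by a fixed function.

```python
import math
import random


def sub_policy(state, weights):
    """Hypothetical sub-policy: maps a state to a Gaussian over the action latent."""
    mean = [sum(w * s for w, s in zip(row, state)) for row in weights]
    std = [0.1] * len(mean)  # fixed exploration noise (an assumption)
    return mean, std


def decoder(latent, horizon=5):
    """Stand-in for a pretrained generative model: latent -> action sequence.

    tanh bounds every action to [-1, 1], loosely mimicking a decoder that
    only produces valid motor commands.
    """
    return [[math.tanh(z * (t + 1) / horizon) for z in latent]
            for t in range(horizon)]


def act(state, weights, rng):
    """Sample a latent from the sub-policy, then decode it into motor actions."""
    mean, std = sub_policy(state, weights)
    latent = [rng.gauss(m, s) for m, s in zip(mean, std)]
    return latent, decoder(latent)


rng = random.Random(0)
weights = [[0.5, -0.2], [0.1, 0.3]]  # maps a 2-D state to a 2-D latent
latent, actions = act([1.0, 2.0], weights, rng)
print(len(actions), "time steps,", len(actions[0]), "action dimensions")
```

In this sketch only `weights` would be updated by RL, while the decoder stays fixed, which is one way to read the abstract's claim that exploration is restricted to valid action sequences.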
Pages: 37