Shaped Policy Search for Evolutionary Strategies using Waypoints

Cited by: 0
Authors
Lekkala, Kiran [1]
Itti, Laurent [2]
Affiliations
[1] Univ Southern Calif, ILab, Dept Comp Sci, Los Angeles, CA 90089 USA
[2] Univ Southern Calif, ILab, Dept Comp Sci Psychol & NGP, Los Angeles, CA 90089 USA
DOI
10.1109/ICRA48506.2021.9561607
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we try to improve exploration in black-box methods, particularly Evolution Strategies (ES), when applied to Reinforcement Learning (RL) problems where intermediate waypoints/subgoals are available. Since Evolution Strategies are highly parallelizable, instead of extracting just a scalar cumulative reward, we use the state-action pairs from the trajectories obtained during rollouts/evaluations to learn the dynamics of the agent. The learnt dynamics are then used in the optimization procedure to speed up training. Lastly, we show how our proposed approach is universally applicable by presenting results from experiments conducted on CARLA driving and UR5 robotic arm simulators.
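The idea of keeping state-action pairs from ES rollouts, rather than only the scalar return, can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D toy environment, the proportional policy, the linear dynamics model, and every name below (`GOAL`, `rollout`, `fit_dynamics`, `run_es`) are assumptions made for illustration. The abstract does not say how the learnt dynamics enter the optimization, so the sketch stops at fitting the model.

```python
import random

GOAL = 1.0                  # hypothetical waypoint position (assumption)
A_TRUE, B_TRUE = 1.0, 0.5   # toy dynamics s' = A*s + B*u (assumption)


def rollout(theta, steps=20):
    """Run one episode; return the scalar return AND the transitions."""
    s, total, transitions = 0.0, 0.0, []
    for _ in range(steps):
        u = theta * (GOAL - s)               # proportional policy toward the waypoint
        s_next = A_TRUE * s + B_TRUE * u
        transitions.append((s, u, s_next))   # data a plain ES loop would discard
        total -= abs(GOAL - s_next)          # reward: negative distance to waypoint
        s = s_next
    return total, transitions


def fit_dynamics(transitions):
    """Least-squares fit of s' = a*s + b*u via the 2x2 normal equations."""
    sss = sum(s * s for s, _, _ in transitions)
    ssu = sum(s * u for s, u, _ in transitions)
    suu = sum(u * u for _, u, _ in transitions)
    ssn = sum(s * n for s, _, n in transitions)
    sun = sum(u * n for _, u, n in transitions)
    det = sss * suu - ssu * ssu
    a = (ssn * suu - ssu * sun) / det
    b = (sss * sun - ssu * ssn) / det
    return a, b


def run_es(iterations=30, population=8, sigma=0.1, alpha=0.02, seed=0):
    """Basic ES update on a single policy parameter, collecting transitions."""
    rng = random.Random(seed)
    theta, data = 0.0, []
    for _ in range(iterations):
        eps = [rng.gauss(0.0, 1.0) for _ in range(population)]
        results = [rollout(theta + sigma * e) for e in eps]
        returns = [r for r, _ in results]
        for _, transitions in results:
            data.extend(transitions)
        mean = sum(returns) / population
        std = (sum((r - mean) ** 2 for r in returns) / population) ** 0.5
        if std > 0.0:
            # Standardized returns weight each perturbation in the update.
            adv = [(r - mean) / std for r in returns]
            theta += alpha / (population * sigma) * sum(a * e for a, e in zip(adv, eps))
    return theta, fit_dynamics(data)


if __name__ == "__main__":
    theta, (a, b) = run_es()
    print("theta:", theta, "fitted dynamics:", a, b)
```

Because the toy dynamics are noise-free and linear, the fitted coefficients should match `A_TRUE` and `B_TRUE` up to floating-point error; a real agent would need a richer model.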
Pages: 9093-9100
Page count: 8