Recruitment-imitation mechanism for evolutionary reinforcement learning

Cited by: 23
Authors
Lu, Shuai [1, 2]
Han, Shuai [1, 2]
Zhou, Wenbo [1, 2]
Zhang, Junwei [1, 2]
Affiliations
[1] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Evolutionary reinforcement learning; Reinforcement learning; Evolutionary algorithms; Imitation learning
DOI
10.1016/j.ins.2020.12.017
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Reinforcement learning, evolutionary algorithms and imitation learning are three principal methods for continuous control tasks. Reinforcement learning is sample efficient, yet it is sensitive to hyperparameter settings and needs efficient exploration; evolutionary algorithms are stable but have low sample efficiency; imitation learning is both sample efficient and stable, but it requires the guidance of expert data. In this paper, we propose the Recruitment-imitation Mechanism (RIM) for evolutionary reinforcement learning, a scalable framework that combines the advantages of these three methods. The core of the framework is a reinforcement learning agent with dual actors and a single critic. This agent can recruit high-fitness actors from the population maintained by the evolutionary algorithm, and the recruited actors guide it in learning from the experience replay buffer. At the same time, low-fitness actors in the evolutionary population can imitate the behavior patterns of the reinforcement learning agent and thereby raise their fitness. The reinforcement and imitation learners in this framework can be replaced with any off-policy actor-critic reinforcement learner and any data-driven imitation learner. We evaluate RIM on a series of MuJoCo benchmarks for continuous control. The experimental results show that RIM outperforms prior evolutionary and reinforcement learning methods. The components of RIM perform significantly better than the corresponding components of the previous evolutionary reinforcement learning algorithm, and recruitment with a soft update enables the reinforcement learning agent to learn faster than recruitment with a hard update. © 2020 Elsevier Inc. All rights reserved.
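The abstract contrasts recruitment by soft update with recruitment by hard update but does not state the update rule. The sketch below is only an illustration under stated assumptions: it takes "hard update" to mean copying the recruited actor's parameters outright, and "soft update" to mean a Polyak-style interpolation of the kind commonly used for target networks. The names `recruit`, `hard_update`, `soft_update`, and the coefficient `tau` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence

# Assumption: an actor is represented as a dict of named parameter vectors.
Params = Dict[str, List[float]]


def recruit(population: Sequence[Params], fitness: Sequence[float]) -> Params:
    """Return the actor with the highest fitness in the population."""
    best = max(range(len(population)), key=lambda i: fitness[i])
    return population[best]


def hard_update(target: Params, source: Params) -> None:
    """Overwrite the target's parameters with a copy of the source's."""
    for name, values in source.items():
        target[name] = list(values)


def soft_update(target: Params, source: Params, tau: float) -> None:
    """Blend in place: target <- tau * source + (1 - tau) * target."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    for name, values in source.items():
        target[name] = [
            tau * s + (1.0 - tau) * t for s, t in zip(values, target[name])
        ]


if __name__ == "__main__":
    population = [{"w": [0.0, 0.0]}, {"w": [1.0, 2.0]}]
    fitness = [1.0, 5.0]
    agent_actor = {"w": [0.0, 4.0]}

    # Soft recruitment moves the agent's actor part of the way to the elite.
    soft_update(agent_actor, recruit(population, fitness), tau=0.5)
    print(agent_actor)

    # Hard recruitment replaces the agent's actor with the elite's parameters.
    hard_update(agent_actor, recruit(population, fitness))
    print(agent_actor)
```

Under this reading, `tau = 1.0` makes the soft update coincide with the hard update, so the two variants differ only in how abruptly the agent's actor is pulled toward the recruited one.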
Pages: 172-188
Page count: 17