Recruitment-imitation mechanism for evolutionary reinforcement learning

Cited by: 23
Authors
Lu, Shuai [1 ,2 ]
Han, Shuai [1 ,2 ]
Zhou, Wenbo [1 ,2 ]
Zhang, Junwei [1 ,2 ]
Affiliations
[1] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Evolutionary reinforcement learning; Reinforcement learning; Evolutionary algorithms; Imitation learning;
DOI
10.1016/j.ins.2020.12.017
CLC number
TP [Automation technology and computer technology];
Discipline code
0812;
Abstract
Reinforcement learning, evolutionary algorithms, and imitation learning are three principal methods for continuous control tasks. Reinforcement learning is sample efficient, yet sensitive to hyperparameter settings and in need of efficient exploration; evolutionary algorithms are stable but sample inefficient; imitation learning is both sample efficient and stable, but it requires the guidance of expert data. In this paper, we propose the Recruitment-imitation Mechanism (RIM) for evolutionary reinforcement learning, a scalable framework that combines the advantages of all three methods. The core of this framework is a dual-actor, single-critic reinforcement learning agent. This agent recruits high-fitness actors from the population maintained by the evolutionary algorithm, and these recruits guide its learning from the experience replay buffer. At the same time, low-fitness actors in the evolutionary population imitate the behavior patterns of the reinforcement learning agent and thereby raise their fitness. The reinforcement and imitation learners in this framework can be replaced with any off-policy actor-critic reinforcement learner and any data-driven imitation learner. We evaluate RIM on a series of continuous control benchmarks in MuJoCo. The experimental results show that RIM outperforms prior evolutionary and reinforcement learning methods, that RIM's components perform significantly better than the corresponding components of the previous evolutionary reinforcement learning algorithm, and that recruitment with soft updates lets the reinforcement learning agent learn faster than recruitment with hard updates. (C) 2020 Elsevier Inc. All rights reserved.
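To make the mechanism concrete, below is a minimal sketch of RIM's two transfer directions as the abstract describes them, written in a PyTorch style. The function names, the tau value, and the MSE behavior-cloning loss are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def recruit_soft(rl_actor: nn.Module, elite_actor: nn.Module, tau: float = 0.1) -> None:
    # Recruitment: pull the RL agent's actor toward a high-fitness
    # evolutionary actor via a soft (Polyak) update. Setting tau = 1.0
    # would reproduce the hard-update variant the abstract compares against.
    with torch.no_grad():
        for p_rl, p_elite in zip(rl_actor.parameters(), elite_actor.parameters()):
            p_rl.mul_(1.0 - tau).add_(tau * p_elite)

def imitation_step(low_fitness_actor: nn.Module, rl_actor: nn.Module,
                   states: torch.Tensor, optimizer: torch.optim.Optimizer) -> float:
    # Imitation: one behavior-cloning step that regresses a low-fitness
    # population actor's actions onto the RL agent's actions for a batch
    # of states (e.g., drawn from the replay buffer).
    with torch.no_grad():
        target_actions = rl_actor(states)  # teacher actions, no gradient
    loss = F.mse_loss(low_fitness_actor(states), target_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In the full algorithm these two steps would run alongside the usual off-policy actor-critic updates and the evolutionary selection loop; as the abstract notes, either learner can be swapped for another off-policy or data-driven method.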
Pages: 172-188
Page count: 17
Related papers
50 records in total (entries [21]-[30] shown)
  • [21] Imitation-Projected Programmatic Reinforcement Learning
    Verma, Abhinav
    Le, Hoang M.
    Yue, Yisong
    Chaudhuri, Swarat
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [22] VICARIOUS REINFORCEMENT AND IMITATION IN A VERBAL LEARNING SITUATION
    PHILLIPS, RE
JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 1968, 76 (4P1): 669 - &
  • [23] Fast Policy Learning through Imitation and Reinforcement
    Cheng, Ching-An
    Yan, Xinyan
    Wagener, Nolan
    Boots, Byron
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2018, : 845 - 855
  • [24] Reinforcement Learning with Imitation for Cavity Filter Tuning
    Lindstahl, Simon
    Lan, Xiaoyu
    2020 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS (AIM), 2020, : 1335 - 1340
  • [25] Reinforcement and Imitation Learning for Diverse Visuomotor Skills
    Zhu, Yuke
    Wang, Ziyu
    Merel, Josh
    Rusu, Andrei
    Erez, Tom
    Cabi, Serkan
    Tunyasuvunakool, Saran
    Kramar, Janos
    Hadsell, Raia
    de Freitas, Nando
    Heess, Nicolas
    ROBOTICS: SCIENCE AND SYSTEMS XIV, 2018,
  • [26] Reinforcement learning building control approach harnessing imitation learning
    Dey, Sourav
    Marzullo, Thibault
    Zhang, Xiangyu
    Henze, Gregor
    ENERGY AND AI, 2023, 14
  • [27] Tracking the Race Between Deep Reinforcement Learning and Imitation Learning
    Gros, Timo P.
    Hoeller, Daniel
    Hoffmann, Joerg
    Wolf, Verena
    QUANTITATIVE EVALUATION OF SYSTEMS (QEST 2020), 2020, 12289 : 11 - 17
  • [28] An Empirical Comparison on Imitation Learning and Reinforcement Learning for Paraphrase Generation
    Du, Wanyu
    Ji, Yangfeng
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 6012 - 6018
  • [29] Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning
    Li, Ziming
    Kiseleva, Julia
    de Rijke, Maarten
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 6722 - 6729
  • [30] Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
    Rashidinejad, Paria
    Zhu, Banghua
    Ma, Cong
    Jiao, Jiantao
    Russell, Stuart
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34