Recruitment-imitation mechanism for evolutionary reinforcement learning

Cited by: 23
Authors
Lu, Shuai [1 ,2 ]
Han, Shuai [1 ,2 ]
Zhou, Wenbo [1 ,2 ]
Zhang, Junwei [1 ,2 ]
Affiliations
[1] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Evolutionary reinforcement learning; Reinforcement learning; Evolutionary algorithms; Imitation learning;
DOI
10.1016/j.ins.2020.12.017
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Discipline code
0812;
Abstract
Reinforcement learning, evolutionary algorithms, and imitation learning are three principal approaches to continuous control tasks. Reinforcement learning is sample efficient, yet sensitive to hyperparameter settings and in need of efficient exploration; evolutionary algorithms are stable but sample inefficient; imitation learning is both sample efficient and stable, but it requires the guidance of expert data. In this paper, we propose the Recruitment-imitation Mechanism (RIM) for evolutionary reinforcement learning, a scalable framework that combines the advantages of all three methods. The core of this framework is a dual-actor, single-critic reinforcement learning agent. This agent recruits high-fitness actors from the population evolved by the evolutionary algorithm, and the recruited actors guide its learning from the experience replay buffer. At the same time, low-fitness actors in the evolutionary population imitate the behavior patterns of the reinforcement learning agent to raise their fitness. The reinforcement and imitation learners in this framework can be replaced with any off-policy actor-critic reinforcement learner and any data-driven imitation learner. We evaluate RIM on a series of continuous control benchmarks in MuJoCo. The experimental results show that RIM outperforms prior evolutionary and reinforcement learning methods, that RIM's components perform significantly better than the corresponding components of the previous evolutionary reinforcement learning algorithm, and that recruitment via soft update lets the reinforcement learning agent learn faster than recruitment via hard update. (C) 2020 Elsevier Inc. All rights reserved.
Pages: 172-188
Page count: 17
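As a rough illustration of the mechanism described in the abstract, the sketch below shows what the two key operations might look like in PyTorch: a soft-update "recruitment" that pulls the reinforcement learning agent's actor toward a high-fitness actor from the evolutionary population, and a behavior-cloning "imitation" step in which a low-fitness actor regresses onto the agent's actions. The class and function names (Actor, soft_recruit, imitate) and all hyperparameters are illustrative assumptions, not the authors' implementation.

# Minimal sketch of recruitment (soft update) and imitation (behavior cloning),
# assuming deterministic actors for continuous control. Illustrative only.
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Deterministic policy network for continuous control (illustrative)."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


def soft_recruit(rl_actor: Actor, elite_actor: Actor, tau: float = 0.1) -> None:
    """Recruitment via soft update: theta_rl <- (1 - tau) * theta_rl + tau * theta_elite."""
    with torch.no_grad():
        for p_rl, p_elite in zip(rl_actor.parameters(), elite_actor.parameters()):
            p_rl.mul_(1.0 - tau).add_(tau * p_elite)


def imitate(student: Actor, teacher: Actor, states: torch.Tensor,
            epochs: int = 10, lr: float = 1e-3) -> None:
    """Imitation step: a low-fitness actor clones the RL agent's actions on sampled states."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    with torch.no_grad():
        target_actions = teacher(states)  # teacher = RL agent's actor
    for _ in range(epochs):
        loss = nn.functional.mse_loss(student(states), target_actions)
        opt.zero_grad()
        loss.backward()
        opt.step()

In the full framework these two operations would be interleaved with ordinary off-policy actor-critic updates from the shared replay buffer; the soft-update form of recruitment above corresponds to the variant the abstract reports as learning faster than a hard parameter copy.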
Related papers
50 in total
  • [41] Evolutionary Algorithms for Reinforcement Learning
    Moriarty, David E.
    Schultz, Alan C.
    Grefenstette, John J.
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1999, 11: 241-276
  • [42] Deep imitation reinforcement learning with expert demonstration data
    Yi, Menglong
    Xu, Xin
    Zeng, Yujun
    Jung, Seul
    JOURNAL OF ENGINEERING-JOE, 2018, (16): 1567-1573
  • [43] Generating stable molecules using imitation and reinforcement learning
    Meldgaard, Soren Ager
    Koehler, Jonas
    Mortensen, Henrik Lund
    Christiansen, Mads-Peter V.
    Noe, Frank
    Hammer, Bjork
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2022, 3 (01)
  • [44] VICARIOUS REINFORCEMENT AND MODELS BEHAVIOR IN VERBAL LEARNING AND IMITATION
    KAPLAN, KJ
    JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 1972, 95 (02): 448+
  • [45] Effective integration of imitation learning and reinforcement learning by generating internal reward
    Hamahata, Keita
    Taniguchi, Tadahiro
    Sakakibara, Kazutoshi
    Nishikawa, Ikuko
    Tabuchi, Kazuma
    Sawaragi, Tetsuo
    ISDA 2008: EIGHTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, VOL 3, PROCEEDINGS, 2008: 121+
  • [46] Cooperative and Competitive Reinforcement and Imitation Learning for a Mixture of Heterogeneous Learning Modules
    Uchibe, Eiji
    FRONTIERS IN NEUROROBOTICS, 2018, 12
  • [47] Learning through Imitation and Reinforcement Learning: Toward the Acquisition of Painting Motions
    Sakato, Tatsuya
    Ozeki, Motoyuki
    Oka, Natsuki
    2014 IIAI 3RD INTERNATIONAL CONFERENCE ON ADVANCED APPLIED INFORMATICS (IIAI-AAI 2014), 2014: 873-880
  • [48] Addressing Delays in Reinforcement Learning via Delayed Adversarial Imitation Learning
    Xie, Minzhi
    Xia, Bo
    Yu, Yalou
    Wang, Xueqian
    Chang, Yongzhe
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT III, 2023, 14256: 271-282
  • [49] Personalized Dynamic Difficulty Adjustment - Imitation Learning Meets Reinforcement Learning
    Fuchs, Ronja
    Gieseke, Robin
    Dockhorn, Alexander
    2024 IEEE CONFERENCE ON GAMES, COG 2024, 2024
  • [50] Robotics in construction: A critical review of the reinforcement learning and imitation learning paradigms
    Delgado, Juan Manuel Davila
    Oyedele, Lukumon
    ADVANCED ENGINEERING INFORMATICS, 2022, 54