Anchor: The achieved goal to replace the subgoal for hierarchical reinforcement learning

Cited by: 9
Authors
Li, Ruijia [1]
Cai, Zhiling [1]
Huang, Tianyi [1]
Zhu, William [1]
Affiliation
[1] University of Electronic Science and Technology of China, Institute of Fundamental and Frontier Sciences, Chengdu, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Hierarchical reinforcement learning; Reinforcement learning; Continuous control; Intrinsic motivation
DOI
10.1016/j.knosys.2021.107128
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hierarchical reinforcement learning (HRL) extends traditional reinforcement learning methods to complex tasks, such as long-horizon continuous control. As an effective paradigm for HRL, subgoal-based methods use subgoals to provide intrinsic motivation that helps the agent reach the desired goal. However, determining a good subgoal is difficult. In this paper, we present a new concept, the anchor, to replace the subgoal. An anchor is selected from the goals the agent has already achieved. Using the anchor, we propose a new HRL method that encourages the agent to move quickly away from the corresponding anchor in the direction of the desired goal. Specifically, to encourage fast movement, the method uses an intrinsic reward computed from the distance between the current achieved goal and the corresponding anchor. To encourage movement in the right direction, it weights this intrinsic reward by the extrinsic rewards collected while moving away from the anchor. Experiments demonstrate the effectiveness of the proposed method on long-horizon continuous control tasks. © 2021 Elsevier B.V. All rights reserved.
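The abstract describes the intrinsic reward only in words, so the following is a minimal sketch under stated assumptions, not the authors' formula. It assumes the distance is Euclidean, that the weight is the plain sum of extrinsic rewards collected since the anchor was set, and that those rewards are non-negative. The function name `anchor_intrinsic_reward` and its parameters are hypothetical.

```python
import math
from typing import Sequence


def anchor_intrinsic_reward(
    achieved_goal: Sequence[float],
    anchor: Sequence[float],
    extrinsic_rewards: Sequence[float],
) -> float:
    """Illustrative intrinsic reward: distance from the anchor, weighted by extrinsic reward.

    Assumptions (not taken from the paper): Euclidean distance, and a weight
    equal to the sum of the extrinsic rewards collected since the anchor.
    """
    # "Moving fast": a larger distance from the anchor gives a larger reward.
    distance = math.dist(achieved_goal, anchor)
    # "Right direction": scale by the extrinsic rewards gathered along the way.
    weight = sum(extrinsic_rewards)
    return weight * distance


# The agent moved 5 units away from the anchor and collected 1.0 extrinsic reward.
reward = anchor_intrinsic_reward((3.0, 4.0), (0.0, 0.0), [0.5, 0.5])
```

With this product form, moving far from the anchor earns nothing when no extrinsic reward was collected on the way, which is one simple way to read the abstract's "right direction" weighting.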
Pages: 9