Anchor: The achieved goal to replace the subgoal for hierarchical reinforcement learning

Cited by: 9
Authors
Li, Ruijia [1]
Cai, Zhiling [1]
Huang, Tianyi [1]
Zhu, William [1]
Affiliation
[1] University of Electronic Science and Technology of China, Institute of Fundamental and Frontier Sciences, Chengdu, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Hierarchical reinforcement learning; Reinforcement learning; Continuous control; Intrinsic motivation
DOI
10.1016/j.knosys.2021.107128
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hierarchical reinforcement learning (HRL) extends traditional reinforcement learning methods to complex tasks, such as long-horizon continuous control. Subgoal-based HRL, an effective paradigm within HRL, uses subgoals to provide intrinsic motivation that helps the agent reach the desired goal. However, determining a suitable subgoal is difficult. In this paper, we present a new concept, the anchor, to replace the subgoal. An anchor is selected from the goals the agent has already achieved. Using the anchor, we propose a new HRL method that encourages the agent to move quickly away from the corresponding anchor in the direction of the desired goal. Specifically, to encourage fast movement, the method uses an intrinsic reward computed from the distance between the current achieved goal and the corresponding anchor. To encourage movement in the right direction, it weights the intrinsic reward by the extrinsic rewards collected while moving away from that anchor. Experiments demonstrate the effectiveness of the proposed method on long-horizon continuous control tasks. © 2021 Elsevier B.V. All rights reserved.
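
The abstract describes the intrinsic reward only in words, so the following is a minimal sketch of the idea and not the paper's actual formula. It assumes a Euclidean distance between the achieved goal and the anchor, and it assumes a logistic squashing of the summed extrinsic rewards as the weight. The function name `anchor_intrinsic_reward` and all of its parameters are hypothetical.

```python
import math
from typing import Sequence


def anchor_intrinsic_reward(
    achieved_goal: Sequence[float],
    anchor: Sequence[float],
    extrinsic_rewards: Sequence[float],
) -> float:
    """Illustrative intrinsic reward: distance from the anchor, weighted
    by the extrinsic rewards collected since the anchor was set.

    Assumptions (not taken from the paper): Euclidean distance, and a
    logistic weight in (0, 1) computed from the summed extrinsic rewards.
    """
    # "Moving fast": a larger distance from the anchor gives a larger reward.
    distance = math.dist(achieved_goal, anchor)

    # "Right direction": higher extrinsic return gives a larger weight.
    # The sum is clamped so math.exp cannot overflow.
    total = max(-60.0, min(60.0, sum(extrinsic_rewards)))
    weight = 1.0 / (1.0 + math.exp(-total))

    return weight * distance


if __name__ == "__main__":
    # The anchor is an earlier achieved goal; the agent has since moved away.
    anchor = [0.0, 0.0]
    achieved = [3.0, 4.0]
    print(anchor_intrinsic_reward(achieved, anchor, [0.0]))
    print(anchor_intrinsic_reward(achieved, anchor, [-1.0, -1.0]))
```

Under these assumptions, two trajectories that end equally far from the anchor receive different intrinsic rewards when their extrinsic returns differ, which is the direction-sensitivity the abstract describes.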
Pages: 9