Anchor: The achieved goal to replace the subgoal for hierarchical reinforcement learning

Cited by: 9
Authors
Li, Ruijia [1 ]
Cai, Zhiling [1 ]
Huang, Tianyi [1 ]
Zhu, William [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Hierarchical reinforcement learning; Reinforcement learning; Continuous control; Intrinsic motivation;
DOI
10.1016/j.knosys.2021.107128
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Hierarchical reinforcement learning (HRL) extends traditional reinforcement learning methods to complex tasks, such as the continuous control task with long horizon. As an effective paradigm for HRL, the subgoal-based HRL method uses subgoals to provide intrinsic motivation which helps the agent to reach the desired goal. However, it is tough to determine the subgoal. In this paper, we present a new concept called anchor to replace the subgoal. Our anchor is selected from the achieved goals of the agent. By the anchor, we propose a new HRL method which encourages the agent to move fast away from the corresponding anchor in the right direction of reaching the desired goal. Specifically, for moving fast, our new method uses an intrinsic reward computed by the distance between the current achieved goal and the corresponding anchor. Meanwhile, for moving in the right direction, it weights the intrinsic reward by the extrinsic rewards collected in the process of moving away from the corresponding anchor. The experiments demonstrate the effectiveness of the proposed method on the continuous control task with long horizon. © 2021 Elsevier B.V. All rights reserved.
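The reward idea in the abstract can be sketched roughly as follows. This is only an illustration and not the paper's formulation: the abstract does not give the distance metric or the weighting function, so the Euclidean distance, the exponential weight, and the names `anchor_intrinsic_reward` and `temperature` are all assumptions made here.

```python
import math
from typing import Sequence


def anchor_intrinsic_reward(
    achieved_goal: Sequence[float],
    anchor: Sequence[float],
    extrinsic_rewards: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """Hypothetical anchor-based intrinsic reward (illustrative only).

    achieved_goal:     the agent's current achieved goal.
    anchor:            an earlier achieved goal chosen as the anchor.
    extrinsic_rewards: environment rewards collected since the anchor.
    temperature:       assumed scaling factor for the weight.
    """
    # "Moving fast": a larger distance from the anchor gives a larger reward.
    # Euclidean distance is an assumption; the abstract only says "distance".
    distance = math.dist(achieved_goal, anchor)

    # "Right direction": weight the distance by the extrinsic rewards gathered
    # while moving away from the anchor. The exponential form is an assumption
    # that keeps the weight positive even when the rewards are negative.
    weight = math.exp(sum(extrinsic_rewards) / temperature)

    return weight * distance
```

Under these assumptions, a segment that moves far from the anchor but collects poor extrinsic rewards is rewarded less than one that covers the same distance with better extrinsic rewards.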
Pages: 9
Related papers
50 in total
  • [31] Goal Recognition as Reinforcement Learning
    Amado, Leonardo
    Mirsky, Reuth
    Meneguzzi, Felipe
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 9644 - 9651
  • [32] Selecting Subgoal for Social AGV Path Planning by Using Reinforcement Learning
    Wu, Cheng-En
    Tsai, Hsiao-Ping
    2022 23RD IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2022), 2022, : 452 - 457
  • [33] Human-Interactive Subgoal Supervision for Efficient Inverse Reinforcement Learning
    Pan, Xinlei
    Shen, Yilin
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS (AAMAS' 18), 2018, : 1380 - 1387
  • [34] Guide to Control: Offline Hierarchical Reinforcement Learning Using Subgoal Generation for Long-Horizon and Sparse-Reward Tasks
    Shin, Wonchul
    Kim, Yusung
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 4217 - 4225
  • [35] Selecting Subgoal for Social AGV Path Planning by Using Reinforcement Learning
    Wu, Cheng-En
    Tsai, Hsiao-Ping
    Proceedings - IEEE International Conference on Mobile Data Management, 2022, 2022-June : 452 - 457
  • [36] Goal-Conditioned Hierarchical Reinforcement Learning With High-Level Model Approximation
    Luo, Yu
    Ji, Tianying
    Sun, Fuchun
    Liu, Huaping
    Zhang, Jianwei
    Jing, Mingxuan
    Huang, Wenbing
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (02) : 2705 - 2719
  • [37] Subgoal-Based Reward Shaping to Improve Efficiency in Reinforcement Learning
    Okudo, Takato
    Yamada, Seiji
    IEEE ACCESS, 2021, 9 : 97557 - 97568
  • [38] Goal Space Abstraction in Hierarchical Reinforcement Learning via Set-Based Reachability Analysis
    Zadem, Mehdi
    Mover, Sergio
    Nguyen, Sao Mai
    2023 IEEE INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING, ICDL, 2023, : 423 - 428
  • [39] GHGC: Goal-based Hierarchical Group Communication in Multi-Agent Reinforcement Learning
    Jiang, Hao
    Shi, Dianxi
    Xue, Chao
    Wang, Yajie
    Wang, Gongju
    Zhang, Yongjun
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 3507 - 3514
  • [40] Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling
    Sosic, Adrian
    Zoubir, Abdelhak M.
    Rueckert, Elmar
    Peters, Jan
    Koeppl, Heinz
    JOURNAL OF MACHINE LEARNING RESEARCH, 2018, 19