A Controllable Agent by Subgoals in Path Planning Using Goal-Conditioned Reinforcement Learning

Cited by: 4

Authors:

Lee, Gyeong Taek [1,2]

Kim, Kangjin [3,4]

Affiliations:

[1] State Univ New Jersey, Rutgers Univ, Dept Ind & Syst Engn, Piscataway, NJ 08854 USA

[2] AImtory, Seoul 06249, South Korea

[3] Brigham & Womens Hosp, Dept Med, Channing Div Network Med, Boston, MA 02115 USA

[4] Harvard Med Sch, Boston, MA 02115 USA

Source:

IEEE Access | 2023 / Vol. 11

Keywords:

Trajectory; Training; Behavioral sciences; Robots; Reinforcement learning; Task analysis; Memory; Controllable agent; path planning; goal-conditioned reinforcement learning; bidirectional memory editing; MEMORY; ENVIRONMENTS;

DOI:

10.1109/ACCESS.2023.3264264

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The aim of path planning is to search for a path from the starting point to the goal. Numerous studies, however, have dealt with a single predefined goal. That is, an agent who has completed learning cannot reach other goals that have not been visited in the training. In the present study, we propose a novel reinforcement learning (RL) framework for an agent reachable to any subgoal as well as the final goal in path planning. To do this, we utilize goal-conditioned RL and propose bidirectional memory editing to obtain various bidirectional trajectories of the agent. Bidirectional memory editing can generate various behavior and subgoals of the agent from the limited trajectory. Then, the generated subgoals and behaviors of the agent are trained on the policy network so that the agent can reach any subgoals from any starting point. In addition, we present reward shaping for the short path of the agent to reach the goal. In the experimental result, the agent was able to reach the various goals that had never been visited by the agent during the training. We confirmed that the agent could perform difficult missions, such as a round trip, and the agent used the shorter route with reward shaping.
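
The abstract does not spell out how bidirectional memory editing works, so the following is only a minimal sketch of the general idea it describes: turning one limited trajectory into many goal-labeled training samples in both directions. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 4-connected grid world, the invertible action names in `INVERSE`, the sparse reward of 1.0 on reaching the goal, and the hypothetical helpers `relabel` and `bidirectional_edit`.

```python
from typing import List, Tuple

State = Tuple[int, int]
# (state, action, next_state, goal, reward)
Transition = Tuple[State, str, State, State, float]

# Assumption: every action has an exact inverse (deterministic grid moves).
INVERSE = {"up": "down", "down": "up", "left": "right", "right": "left"}


def relabel(states: List[State], actions: List[str]) -> List[Transition]:
    """Treat every later visited state as a (sub)goal for each step."""
    out: List[Transition] = []
    for t, action in enumerate(actions):
        for k in range(t + 1, len(states)):
            goal = states[k]
            # Assumed sparse reward: 1.0 only when this step lands on the goal.
            reward = 1.0 if states[t + 1] == goal else 0.0
            out.append((states[t], action, states[t + 1], goal, reward))
    return out


def bidirectional_edit(states: List[State], actions: List[str]) -> List[Transition]:
    """Generate goal-labeled samples from a trajectory and from its reverse."""
    forward = relabel(states, actions)
    # Reverse the path: walk the states backwards using the inverse actions.
    rev_states = states[::-1]
    rev_actions = [INVERSE[a] for a in reversed(actions)]
    backward = relabel(rev_states, rev_actions)
    return forward + backward


# One short trajectory: (0, 0) -> (0, 1) -> (1, 1)
samples = bidirectional_edit([(0, 0), (0, 1), (1, 1)], ["right", "down"])
print(len(samples))
```

Under these assumptions, a two-step trajectory yields six goal-conditioned samples, three in each direction, which a goal-conditioned policy network could then be trained on. The reverse samples are what would let an agent attempt a return leg, such as the round trip mentioned in the abstract.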

Pages: 33812-33825

Page count: 14
