Dynamic Modeling for Reinforcement Learning with Random Delay

Cited by: 0

Authors:

Yu, Yalou [1]

Xia, Bo [1]

Xie, Minzhi [1]

Li, Zhiheng [1]

Wang, Xueqian [1]

Affiliation:

[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 100084, Peoples R China

Source:

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT IV | 2024 / Vol. 15019

Keywords:

Reinforcement Learning; Delayed Environment; Dynamic Modeling; ROBOT;

DOI:

10.1007/978-3-031-72341-4_26

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Delays in real-world tasks degrade the performance of standard reinforcement learning (RL), which assumes that environmental feedback and action selection are instantaneous. The RL community has proposed many approaches to handle observation delay or action delay. However, previous methods suffer from inaccurate state predictions caused by accumulated error, are limited to tasks with specific action spaces, or cannot handle the more complicated case of random delays. To address these problems, we propose a new algorithm named Prediction model with Arbitrary Delay (PAD), which uses a gated unit to predict delayed states more accurately for better decision making, especially in environments with random delays. Specifically, the proposed method substantially alleviates cumulative errors by using a multi-step prediction model, and its model structure allows it to be applied to different kinds of tasks. Experiments on continuous and discrete control tasks demonstrate that PAD achieves higher performance than state-of-the-art methods for handling delays in RL.
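The setting described in the abstract can be illustrated with a small sketch. This is not the PAD architecture, which the abstract does not specify; it only shows a randomly delayed observation and how rolling a one-step model forward over the pending actions lets a per-step model error accumulate with the delay. The environment `ToyIntegrator`, the wrapper `RandomObservationDelay`, and the helper `predict_current` are hypothetical names introduced here for illustration.

```python
import random
from collections import deque


class ToyIntegrator:
    """Hypothetical 1-D environment: the state is the running sum of actions."""

    def __init__(self):
        self.state = 0.0

    def step(self, action):
        self.state += action
        return self.state


class RandomObservationDelay:
    """Returns a stale state together with the actions applied since it."""

    def __init__(self, env, max_delay, seed=0):
        self.env = env
        self.max_delay = max_delay
        self.rng = random.Random(seed)
        # Keep just enough history to serve the largest possible delay.
        self.states = deque([env.state], maxlen=max_delay + 1)
        self.actions = deque(maxlen=max_delay)

    def step(self, action):
        self.states.append(self.env.step(action))
        self.actions.append(action)
        # The delay cannot exceed the number of steps recorded so far.
        delay = self.rng.randint(0, min(self.max_delay, len(self.actions)))
        stale_state = self.states[-1 - delay]
        pending = list(self.actions)[len(self.actions) - delay:]
        return stale_state, pending


def predict_current(stale_state, pending_actions, model):
    """Roll a one-step model forward once per pending action."""
    state = stale_state
    for action in pending_actions:
        state = model(state, action)
    return state
```

With an exact model such as `lambda s, a: s + a`, `predict_current` recovers the true current state of the toy environment. With a model that is off by a constant at every step, the prediction error grows in proportion to the number of pending actions, which is the accumulation problem that, according to the abstract, a multi-step prediction model is meant to reduce.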


Pages: 381-396

Number of pages: 16
