Dynamic Modeling for Reinforcement Learning with Random Delay

Cited by: 0
Authors
Yu, Yalou [1 ]
Xia, Bo [1 ]
Xie, Minzhi [1 ]
Li, Zhiheng [1 ]
Wang, Xueqian [1 ]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 100084, Peoples R China
Keywords
Reinforcement Learning; Delayed Environment; Dynamic Modeling; Robot
DOI
10.1007/978-3-031-72341-4_26
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Delays in real-world tasks degrade the performance of standard reinforcement learning (RL), which assumes that environmental feedback and action execution are instantaneous. Many approaches have been proposed in the RL community to handle observation delay or action delay. However, previous methods suffer from inaccurate state predictions caused by error accumulation, are limited to tasks with a specific action space, or cannot cope with the more complicated setting of random delays. To address these problems, we propose a new algorithm, Prediction model with Arbitrary Delay (PAD), which predicts delayed states more accurately through a gated unit and thereby supports better decision making, especially in environments with random delays. Specifically, the proposed method greatly alleviates cumulative error by using a multi-step prediction model and, by virtue of its model structure, can be applied to different kinds of tasks. Experiments on continuous and discrete control tasks demonstrate that PAD outperforms state-of-the-art methods for RL under delays.
Pages: 381-396
Number of pages: 16
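The abstract above describes PAD only at a high level: a gated unit rolls the last observed state forward through the actions issued during a (possibly random) delay, so that multi-step prediction limits error accumulation. The following minimal Python/PyTorch sketch illustrates that general idea only; it is not the authors' implementation, and all names (GatedStatePredictor, hidden_dim, the choice of a GRU cell as the gated unit) are illustrative assumptions.

# Sketch, not the paper's code: a gated multi-step predictor that estimates the
# current (undelayed) state from the last observed state and the actions taken
# during a d-step delay. The delay length d may vary per sample (random delays).
import torch
import torch.nn as nn

class GatedStatePredictor(nn.Module):
    """Multi-step state predictor built around a gated update (GRU cell)."""

    def __init__(self, state_dim: int, action_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.encoder = nn.Linear(state_dim, hidden_dim)   # embed the last observed state
        self.cell = nn.GRUCell(action_dim, hidden_dim)    # gated one-step transition
        self.decoder = nn.Linear(hidden_dim, state_dim)   # map hidden state back to a state estimate

    def forward(self, last_state: torch.Tensor, delayed_actions: torch.Tensor) -> torch.Tensor:
        # last_state:      (batch, state_dim)     -- most recent state actually observed
        # delayed_actions: (batch, d, action_dim) -- actions issued during the d-step delay
        h = torch.tanh(self.encoder(last_state))
        for t in range(delayed_actions.shape[1]):         # roll forward one step per buffered action
            h = self.cell(delayed_actions[:, t], h)
        return self.decoder(h)                            # estimate of the current state

if __name__ == "__main__":
    predictor = GatedStatePredictor(state_dim=17, action_dim=6)
    s = torch.randn(32, 17)        # batch of last observed states
    a = torch.randn(32, 4, 6)      # a 4-step delay: 4 buffered actions per sample
    print(predictor(s, a).shape)   # torch.Size([32, 17])

In the usage pattern the abstract suggests, the predicted state would replace the stale observation as the input to an otherwise standard RL policy; how PAD trains this predictor and handles per-sample random delay lengths is detailed in the paper itself.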
Related Papers
50 records in total
  • [41] Abolishing the effect of reinforcement delay on human causal learning
    Buehner, MJ
    May, J
    QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY SECTION B-COMPARATIVE AND PHYSIOLOGICAL PSYCHOLOGY, 2004, 57 (02) : 179 - 191
  • [42] Reinforcement Learning to Improve QoS and Minimizing Delay in IoT
    Subramaniam, Mahendrakumar
    Vedanarayanan, V.
    Mubarakali, Azath
    Priya, S. Sathiya
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 36 (02) : 1603 - 1612
  • [43] Reinforcement learning based routing in delay tolerant networks
    Rezaei, Parisa
    Derakhshanfard, Nahideh
    WIRELESS NETWORKS, 2025, 31 (03) : 2909 - 2923
  • [44] Reinforcement Learning for Minimizing Communication Delay in Edge Computing
    Rajashekar, Kolichala
    2022 IEEE 42ND INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2022), 2022, : 1270 - 1271
  • [45] Modeling for thermoelastic properties of advanced nanocomposites of random reinforcement
    Buryachenko, VA
    Roy, A
    MATERIALS PROCESSING AND DESIGN: MODELING, SIMULATION AND APPLICATIONS, PTS 1 AND 2, 2004, 712 : 117 - 122
  • [46] A Dynamic Trust Model for Underwater Sensor Networks Fusing Deep Reinforcement Learning and Random Forest Algorithm
    Wang, Beibei
    Yue, Xiufang
    Liu, Yonglei
    Hao, Kun
    Li, Zhisheng
    Zhao, Xiaofang
    APPLIED SCIENCES-BASEL, 2024, 14 (08):
  • [47] The modeling of random Brillouin dynamic grating
    Song, Yingying
    Li, Shichuan
    Zhang, Jianzhong
    Zhang, Mingjiang
    Qiao, Lijun
    Wang, Tao
    OPTICAL FIBER TECHNOLOGY, 2019, 53
  • [48] Reinforcement learning application for dynamic trust modeling in large-scale open distributed systems
    Li, Xiaoyong
    Gui, Xiaolin
    Zhao, Juan
    Zhao, Bo
    JOURNAL OF COMPUTATIONAL INFORMATION SYSTEMS, 2008, 4 (06) : 2591 - 2597
  • [49] Dynamic modeling and control of pneumatic artificial muscles via Deep Lagrangian Networks and Reinforcement Learning
    Wang, Shuopeng
    Wang, Rixin
    Liu, Yanhui
    Zhang, Ying
    Hao, Lina
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 148
  • [50] Learning to Control Random Boolean Networks: A Deep Reinforcement Learning Approach
    Papagiannis, Georgios
    Moschoyiannis, Sotiris
    COMPLEX NETWORKS AND THEIR APPLICATIONS VIII, VOL 1, 2020, 881 : 721 - 734