Dynamic Modeling for Reinforcement Learning with Random Delay

Cited by: 0
Authors
Yu, Yalou [1 ]
Xia, Bo [1 ]
Xie, Minzhi [1 ]
Li, Zhiheng [1 ]
Wang, Xueqian [1 ]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 100084, Peoples R China
Keywords
Reinforcement Learning; Delayed Environment; Dynamic Modeling; Robot
DOI
10.1007/978-3-031-72341-4_26
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Delays in real-world tasks degrade the performance of standard reinforcement learning (RL), which assumes that environmental feedback and action execution are instantaneous. Many approaches have been proposed in the RL community to handle observation delay or action delay. However, previous methods suffer from inaccurate state predictions caused by error accumulation, are limited to tasks with a specific action space, or cannot cope with the more complicated setting of random delays. To address these problems, we propose a new algorithm, Prediction model with Arbitrary Delay (PAD), which predicts delayed states more accurately through a gated unit and thereby supports better decision making, especially in environments with random delays. Specifically, the proposed method greatly alleviates cumulative error by using a multi-step prediction model and, by virtue of its model structure, can be applied to different kinds of tasks. Experiments on continuous and discrete control tasks demonstrate that PAD outperforms state-of-the-art methods for RL under delays.
Pages: 381-396
Number of pages: 16
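The abstract above describes PAD only at a high level: a gated unit rolls the last observed state forward through the actions issued during a (possibly random) delay, so that multi-step prediction limits error accumulation. The following minimal Python/PyTorch sketch illustrates that general idea only; it is not the authors' implementation, and all names (GatedStatePredictor, hidden_dim, the choice of a GRU cell as the gated unit) are illustrative assumptions.

# Sketch, not the paper's code: a gated multi-step predictor that estimates the
# current (undelayed) state from the last observed state and the actions taken
# during a d-step delay. The delay length d may vary per sample (random delays).
import torch
import torch.nn as nn

class GatedStatePredictor(nn.Module):
    """Multi-step state predictor built around a gated update (GRU cell)."""

    def __init__(self, state_dim: int, action_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.encoder = nn.Linear(state_dim, hidden_dim)   # embed the last observed state
        self.cell = nn.GRUCell(action_dim, hidden_dim)    # gated one-step transition
        self.decoder = nn.Linear(hidden_dim, state_dim)   # map hidden state back to a state estimate

    def forward(self, last_state: torch.Tensor, delayed_actions: torch.Tensor) -> torch.Tensor:
        # last_state:      (batch, state_dim)     -- most recent state actually observed
        # delayed_actions: (batch, d, action_dim) -- actions issued during the d-step delay
        h = torch.tanh(self.encoder(last_state))
        for t in range(delayed_actions.shape[1]):         # roll forward one step per buffered action
            h = self.cell(delayed_actions[:, t], h)
        return self.decoder(h)                            # estimate of the current state

if __name__ == "__main__":
    predictor = GatedStatePredictor(state_dim=17, action_dim=6)
    s = torch.randn(32, 17)        # batch of last observed states
    a = torch.randn(32, 4, 6)      # a 4-step delay: 4 buffered actions per sample
    print(predictor(s, a).shape)   # torch.Size([32, 17])

In the usage pattern the abstract suggests, the predicted state would replace the stale observation as the input to an otherwise standard RL policy; how PAD trains this predictor and handles per-sample random delay lengths is detailed in the paper itself.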
Related Papers
50 records in total
  • [41] Abolishing the effect of reinforcement delay on human causal learning
    Buehner, MJ
    May, J
    QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY SECTION B-COMPARATIVE AND PHYSIOLOGICAL PSYCHOLOGY, 2004, 57 (02) : 179 - 191
  • [42] Reinforcement Learning to Improve QoS and Minimizing Delay in IoT
    Subramaniam, Mahendrakumar
    Vedanarayanan, V.
    Mubarakali, Azath
    Priya, S. Sathiya
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 36 (02) : 1603 - 1612
  • [43] Reinforcement learning based routing in delay tolerant networks
    Rezaei, Parisa
    Derakhshanfard, Nahideh
    WIRELESS NETWORKS, 2025, 31 (03) : 2909 - 2923
  • [44] Reinforcement Learning for Minimizing Communication Delay in Edge Computing
    Rajashekar, Kolichala
    2022 IEEE 42ND INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2022), 2022, : 1270 - 1271
  • [45] Modeling for thermoelastic properties of advanced nanocomposites of random reinforcement
    Buryachenko, VA
    Roy, A
    MATERIALS PROCESSING AND DESIGN: MODELING, SIMULATION AND APPLICATIONS, PTS 1 AND 2, 2004, 712 : 117 - 122
  • [46] A Dynamic Trust Model for Underwater Sensor Networks Fusing Deep Reinforcement Learning and Random Forest Algorithm
    Wang, Beibei
    Yue, Xiufang
    Liu, Yonglei
    Hao, Kun
    Li, Zhisheng
    Zhao, Xiaofang
    APPLIED SCIENCES-BASEL, 2024, 14 (08):
  • [47] The modeling of random Brillouin dynamic grating
    Song, Yingying
    Li, Shichuan
    Zhang, Jianzhong
    Zhang, Mingjiang
    Qiao, Lijun
    Wang, Tao
    OPTICAL FIBER TECHNOLOGY, 2019, 53
  • [48] Reinforcement learning application for dynamic trust modeling in large-scale open distributed systems
    Li, Xiaoyong
    Gui, Xiaolin
    Zhao, Juan
    Zhao, Bo
    JOURNAL OF COMPUTATIONAL INFORMATION SYSTEMS, 2008, 4 (06) : 2591 - 2597
  • [49] Dynamic modeling and control of pneumatic artificial muscles via Deep Lagrangian Networks and Reinforcement Learning
    Wang, Shuopeng
    Wang, Rixin
    Liu, Yanhui
    Zhang, Ying
    Hao, Lina
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 148
  • [50] Learning to Control Random Boolean Networks: A Deep Reinforcement Learning Approach
    Papagiannis, Georgios
    Moschoyiannis, Sotiris
    COMPLEX NETWORKS AND THEIR APPLICATIONS VIII, VOL 1, 2020, 881 : 721 - 734