Dynamic Modeling for Reinforcement Learning with Random Delay

Cited by: 0
Authors
Yu, Yalou [1]
Xia, Bo [1]
Xie, Minzhi [1]
Li, Zhiheng [1]
Wang, Xueqian [1]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 100084, Peoples R China
Keywords
Reinforcement Learning; Delayed Environment; Dynamic Modeling; ROBOT;
DOI
10.1007/978-3-031-72341-4_26
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Delays in real-world tasks degrade the performance of standard reinforcement learning (RL), which assumes that environmental feedback and action selection are instantaneous. Many approaches have been proposed in the RL community to address the problems caused by observation delay or action delay. However, previous methods suffer from inaccurate state predictions due to accumulated error, are limited to tasks with specific action spaces, or struggle with more complicated random-delay situations. To address these problems, this paper proposes a new algorithm named Prediction model with Arbitrary Delay (PAD), which aims to predict delayed states more accurately through a gated unit for better decision making, especially in environments with random delays. Specifically, the proposed method greatly alleviates cumulative errors by using a multi-step prediction model and, by virtue of its unique model structure, can be applied to different kinds of tasks. Experiments on continuous and discrete control tasks demonstrate that PAD achieves higher performance than state-of-the-art methods in handling delays in RL.
Pages: 381-396
Page count: 16
Related papers
50 in total
  • [21] RESTRICTED RANDOM REINFORCEMENT IN PROBABILITY-LEARNING
    MADISON, HL
    JOHNS, MD
    PSYCHOLOGICAL REPORTS, 1965, 16 (03) : 733 - 736
  • [22] Deep Reinforcement Learning with the Random Neural Network
    Serrano, Will
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 110
  • [23] A kernel-based reinforcement learning approach to dynamic behavior modeling of intrusion detection
    Xu, Xin
    Luo, Yirong
    ADVANCES IN NEURAL NETWORKS - ISNN 2007, PT 1, PROCEEDINGS, 2007, 4491 : 455 - +
  • [24] Delay-aware dynamic access control for mMTC in wireless networks using deep reinforcement learning
    Pacheco-Paramo, Diego
    Tello-Oquendo, Luis
    COMPUTER NETWORKS, 2020, 182 (182)
  • [25] Dynamic Pricing by Multiagent Reinforcement Learning
    Han, Wei
    Liu, Lingbo
    Zheng, Huaili
    PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON ELECTRONIC COMMERCE AND SECURITY, 2008, : 226 - 229
  • [26] Reinforcement Learning for Dynamic Microfluidic Control
    Dressler, Oliver J.
    Howes, Philip D.
    Choo, Jaebum
    deMello, Andrew J.
    ACS OMEGA, 2018, 3 (08) : 10084 - 10091
  • [27] Reinforcement Learning for Fair Dynamic Pricing
    Maestre, Roberto
    Duque, Juan
    Rubio, Alberto
    Arevalo, Juan
    INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1, 2019, 868 : 120 - 135
  • [28] Dynamic action sequences in reinforcement learning
    Moren, J
    FROM ANIMALS TO ANIMATS 5, 1998, : 366 - 371
  • [29] Dynamic Ensemble Selection with Reinforcement Learning
    Liu, Lihua
    Wu, Jibing
    Li, Xuan
    Huang, Hongbin
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT V, 2023, 14090 : 629 - 640
  • [30] Dynamic Colocation Policies with Reinforcement Learning
    Li, Yuhao
    Sun, Dan
    Lee, Benjamin C.
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2020, 17 (01)