Dynamic Modeling for Reinforcement Learning with Random Delay

Authors
Yu, Yalou [1]
Xia, Bo [1]
Xie, Minzhi [1]
Li, Zhiheng [1]
Wang, Xueqian [1]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 100084, Peoples R China
Keywords
Reinforcement Learning; Delayed Environment; Dynamic Modeling; ROBOT
DOI
10.1007/978-3-031-72341-4_26
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Delays in real-world tasks degrade the performance of standard reinforcement learning (RL), which assumes that environmental feedback and action selection are instantaneous. Many approaches have been proposed in the RL community to address observation delay or action delay. However, previous methods suffer from inaccurate state predictions caused by accumulated error, are limited to tasks with specific action spaces, or cannot handle more complicated random-delay settings. To address these problems, we propose a new algorithm named Prediction model with Arbitrary Delay (PAD), which uses a gated unit to predict delayed states more accurately for better decision making, especially in environments with random delays. Specifically, the proposed method substantially alleviates cumulative errors by using a multi-step prediction model, and its model structure allows it to be applied to different kinds of tasks. Experiments on continuous and discrete control tasks demonstrate that PAD achieves higher performance than state-of-the-art methods for handling delays in RL.
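As a rough illustration of the random observation-delay setting the abstract refers to (not the PAD algorithm itself, whose details are not given in this record), the sketch below simulates observations that reach the agent after a random number of steps. The class name RandomDelayBuffer, its parameters, and the uniform delay distribution are all assumptions made for this example.

```python
import random


class RandomDelayBuffer:
    """Hypothetical helper: each observation arrives after a random
    delay of 0..max_delay steps (uniform delay is an assumption)."""

    def __init__(self, max_delay, seed=0):
        self.max_delay = max_delay
        self.rng = random.Random(seed)
        self.pending = []      # (arrival_step, emitted_step, observation)
        self.step_count = 0
        self.latest = None     # (emitted_step, observation) most recent seen

    def push(self, observation):
        """Emit one true observation; return (visible_obs, age_in_steps).

        Returns (None, None) until the first observation has arrived.
        """
        now = self.step_count
        delay = self.rng.randint(0, self.max_delay)
        self.pending.append((now + delay, now, observation))

        # Deliver everything whose arrival time has been reached.
        arrived = [p for p in self.pending if p[0] <= now]
        self.pending = [p for p in self.pending if p[0] > now]
        for _, emitted, obs in arrived:
            # Keep only the freshest observation; stale arrivals are dropped.
            if self.latest is None or emitted > self.latest[0]:
                self.latest = (emitted, obs)

        self.step_count += 1
        if self.latest is None:
            return None, None
        emitted, obs = self.latest
        return obs, now - emitted


if __name__ == "__main__":
    buf = RandomDelayBuffer(max_delay=3, seed=1)
    for t in range(8):
        obs, age = buf.push(t)  # the "state" here is just the step index
        print(f"step {t}: agent sees {obs!r}, which is {age} step(s) old")
```

The agent acts on a state that may be several steps old, and the age varies from step to step. A delay-aware method has to bridge that gap, for example by predicting the current state from the last visible one, which is where prediction error can accumulate.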
Pages: 381-396
Page count: 16