Distributionally Robust Model-based Reinforcement Learning with Large State Spaces

Cited by: 0
Authors
Ramesh, Shyam Sundhar [1 ]
Sessa, Pier Giuseppe [2 ]
Hu, Yifan [3 ]
Krause, Andreas [2 ]
Bogunovic, Ilija [1 ]
Affiliations
[1] UCL, London, England
[2] Swiss Fed Inst Technol, Zurich, Switzerland
[3] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Markov decision processes;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Three major challenges in reinforcement learning are the complex dynamical systems with large state spaces, the costly data acquisition processes, and the deviation of real-world dynamics from the training environment at deployment. To overcome these issues, we study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets. We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics, leveraging access to a generative model (i.e., simulator). We further demonstrate the statistical sample complexity of the proposed method for different uncertainty sets. These complexity bounds are independent of the number of states and extend beyond linear dynamics, ensuring the effectiveness of our approach in identifying near-optimal distributionally robust policies. The proposed method can be further combined with other model-free distributionally robust reinforcement learning methods to obtain a near-optimal robust policy. Experimental results demonstrate the robustness of our algorithm to distributional shifts and its superior performance in terms of the number of samples needed.
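To make the model-learning step concrete, the following minimal Python sketch (not the authors' code; the toy one-dimensional dynamics, kernel choice, and all hyperparameters are illustrative assumptions) shows the core idea the abstract describes: fit a Gaussian Process model of the nominal transitions and repeatedly query the generative model at the state-action pair of maximum posterior variance, i.e., a maximum-variance-reduction sampling rule.

# Minimal sketch, assuming a 1-D toy system: learn a nominal transition model
# f(s, a) ~ s' with a Gaussian Process, and query the simulator where the GP
# posterior variance is largest (maximum variance reduction).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def simulator(s, a, noise=0.05):
    """Hypothetical generative model: noisy next state of a damped, actuated system."""
    return 0.9 * s + 0.1 * np.sin(3.0 * s) + 0.2 * a + noise * rng.standard_normal()

# Candidate state-action pairs the learner may query.
S, A = np.meshgrid(np.linspace(-1, 1, 25), np.linspace(-1, 1, 25))
candidates = np.column_stack([S.ravel(), A.ravel()])

# Seed the GP with a few random queries, then greedily query the most uncertain pair.
X = candidates[rng.choice(len(candidates), size=5, replace=False)]
y = np.array([simulator(s, a) for s, a in X])
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=0.05 ** 2)

for _ in range(30):
    gp.fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(std)]   # point of maximum posterior variance
    y_next = simulator(*x_next)           # one call to the generative model
    X = np.vstack([X, x_next])
    y = np.append(y, y_next)

print("max posterior std after sampling:", gp.predict(candidates, return_std=True)[1].max())

The learned nominal model is then used for robust planning. For the Kullback-Leibler uncertainty set, for example, the inner worst-case expectation of a value function V over a KL ball of radius rho around the nominal next-state distribution P0 admits the standard distributionally robust optimization dual sup_{beta >= 0} { -beta log E_{P0}[exp(-V/beta)] - beta * rho }, which reduces the robust Bellman backup to a one-dimensional optimization over beta; this is the generic duality, not necessarily the exact procedure used in the paper.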
Pages: 42