Distributionally Robust Model-based Reinforcement Learning with Large State Spaces

Cited by: 0
Authors
Ramesh, Shyam Sundhar [1 ]
Sessa, Pier Giuseppe [2 ]
Hu, Yifan [3 ]
Krause, Andreas [2 ]
Bogunovic, Ilija [1 ]
Affiliations
[1] UCL, London, England
[2] Swiss Fed Inst Technol, Zurich, Switzerland
[3] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
MARKOV DECISION-PROCESSES;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition processes, and the deviation of real-world dynamics at deployment from the training environment. To overcome these issues, we study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets. We propose a model-based approach that utilizes Gaussian processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics, leveraging access to a generative model (i.e., simulator). We further establish the statistical sample complexity of the proposed method for the different uncertainty sets. These complexity bounds are independent of the number of states and extend beyond linear dynamics, ensuring the effectiveness of our approach in identifying near-optimal distributionally robust policies. The proposed method can further be combined with other model-free distributionally robust reinforcement learning methods to obtain a near-optimal robust policy. Experimental results demonstrate the robustness of our algorithm to distributional shifts and its superior performance in terms of the number of samples needed.
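To illustrate the kind of robust backup the Kullback-Leibler uncertainty set induces, the sketch below computes the worst-case expected value of a next-state value vector over a KL ball around a nominal (e.g., GP-estimated) transition distribution, using the standard convex dual of the KL-constrained problem. This is a generic textbook sketch, not the paper's implementation; the function name and the radius parameter `rho` are illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def kl_robust_value(values, nominal_probs, rho):
    """Worst-case expected value over the KL ball
    {P : KL(P || P0) <= rho} around a nominal distribution P0.

    Uses the standard dual form:
      inf_P E_P[V] = sup_{lam > 0}  -lam * log E_{P0}[exp(-V / lam)] - lam * rho
    """
    values = np.asarray(values, dtype=float)
    p0 = np.asarray(nominal_probs, dtype=float)

    def neg_dual(lam):
        # log-sum-exp trick for numerical stability
        z = -values / lam
        m = z.max()
        lse = m + np.log(np.sum(p0 * np.exp(z - m)))
        return -(-lam * lse - lam * rho)  # negate: we minimize

    res = minimize_scalar(neg_dual, bounds=(1e-6, 1e6), method="bounded")
    return -res.fun
```

As the radius `rho` grows, the KL ball widens and the robust value decreases toward the minimum attainable value; at `rho` near zero it approaches the nominal expectation. A robust Bellman backup would apply this per state-action pair with `nominal_probs` taken from the learned transition model.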
Pages: 42