Improving Offline Reinforcement Learning with Inaccurate Simulators

Authors
Hou, Yiwen [1]
Sun, Haoyuan [1]
Ma, Jinming [1]
Wu, Feng [1]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/ICRA57147.2024.10610833
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Offline reinforcement learning (RL) provides a promising approach to avoid costly online interaction with the real environment. However, the performance of offline RL depends heavily on the quality of the dataset, whose limitations may cause extrapolation error during learning. In many robotic applications, an inaccurate simulator is often available. However, data collected from the inaccurate simulator cannot be used directly in offline RL because of the well-known exploration-exploitation dilemma and the dynamics gap between the inaccurate simulation and the real environment. To address these issues, we propose a novel approach that combines the offline dataset and the inaccurate simulation data more effectively. Specifically, we pre-train a generative adversarial network (GAN) model to fit the state distribution of the offline dataset. Given this, we collect data from the inaccurate simulator starting from the distribution provided by the generator and reweight the simulated data using the discriminator. Our experimental results on the D4RL benchmark and a real-world manipulation task confirm that our method benefits more from both the inaccurate simulator and limited offline datasets, achieving better performance than state-of-the-art methods.
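The data-collection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the weighting formula, so the density-ratio form D/(1 - D), the clipping threshold, and the mean-one normalization are assumptions, and the names `discriminator_weights`, `collect_simulated_data`, `generator`, `simulator_step`, `policy`, and `discriminator` are hypothetical placeholders for a pre-trained GAN, an inaccurate simulator, and a behavior policy.

```python
import numpy as np


def discriminator_weights(d_scores, eps=1e-6, clip=10.0):
    """Turn discriminator outputs D(s) in (0, 1) into per-sample weights.

    Assumption (not from the abstract): D(s) estimates the probability that
    state s comes from the offline dataset, so D / (1 - D) approximates a
    density ratio between offline and simulated states.
    """
    d = np.clip(np.asarray(d_scores, dtype=float), eps, 1.0 - eps)
    ratio = np.minimum(d / (1.0 - d), clip)  # cap extreme ratios
    return ratio / ratio.mean()              # normalize to mean weight 1


def collect_simulated_data(generator, simulator_step, policy, discriminator,
                           n_starts, horizon, rng):
    """Roll out the simulator from generator-sampled start states.

    All four callables are hypothetical stand-ins:
      generator(n, rng)     -> array of n start states
      simulator_step(s, a)  -> next state under the simulator dynamics
      policy(s, rng)        -> action for state s
      discriminator(states) -> scores in (0, 1), one per state
    """
    states, actions, next_states = [], [], []
    for s in generator(n_starts, rng):
        for _ in range(horizon):
            a = policy(s, rng)
            s_next = simulator_step(s, a)
            states.append(s)
            actions.append(a)
            next_states.append(s_next)
            s = s_next
    states = np.array(states)
    weights = discriminator_weights(discriminator(states))
    return states, np.array(actions), np.array(next_states), weights
```

In this sketch, simulated transitions whose states the discriminator judges close to the offline data receive larger weights, and those weights could then scale each transition's contribution to an offline RL loss.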
Pages: 5162-5168
Page count: 7