Improving Offline Reinforcement Learning with Inaccurate Simulators

Citations: 0

Authors:

Hou, Yiwen [1]

Sun, Haoyuan [1]

Ma, Jinming [1]

Wu, Feng [1]

Affiliations:

[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Anhui, Peoples R China

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024 | 2024

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

10.1109/ICRA57147.2024.10610833

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Offline reinforcement learning (RL) is a promising way to avoid costly online interaction with the real environment. However, its performance depends heavily on the quality of the dataset, which can lead to extrapolation error during learning. In many robotic applications, an inaccurate simulator is available. However, data collected from such a simulator cannot be used directly in offline RL, both because of the well-known exploration-exploitation dilemma and because of the dynamics gap between the inaccurate simulation and the real environment. To address these issues, we propose a novel approach that combines the offline dataset with data from the inaccurate simulator. Specifically, we pre-train a generative adversarial network (GAN) to fit the state distribution of the offline dataset. We then collect data from the inaccurate simulator, starting from states sampled from the generator, and reweight the simulated data using the discriminator. Experimental results on the D4RL benchmark and a real-world manipulation task confirm that our method benefits more from both the inaccurate simulator and limited offline datasets, achieving better performance than state-of-the-art methods.
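
The data-collection and reweighting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the reweighting rule, so the sketch assumes the standard GAN density-ratio estimate D / (1 - D), applied to the current state. All names (`discriminator_weight`, `collect_weighted_rollouts`, and the `toy_*` stand-ins for the generator, simulator, policy, and discriminator) are hypothetical.

```python
import math
import random


def discriminator_weight(d_score, eps=1e-6):
    """Map a discriminator output D(s) in (0, 1) to an importance weight.

    Assumption: for a well-trained GAN discriminator, D / (1 - D) estimates
    the ratio of offline-data density to generated-data density. The paper
    may use a different rule.
    """
    d = min(max(d_score, eps), 1.0 - eps)  # clip to avoid division by zero
    return d / (1.0 - d)


def collect_weighted_rollouts(sample_start, sim_step, policy, discriminator,
                              n_rollouts, horizon):
    """Roll out the simulator from generator-sampled start states and
    attach a discriminator-based weight to every transition."""
    data = []
    for _ in range(n_rollouts):
        state = sample_start()  # start state drawn from the generator
        for _ in range(horizon):
            action = policy(state)
            next_state, reward = sim_step(state, action)
            weight = discriminator_weight(discriminator(state))
            data.append((state, action, reward, next_state, weight))
            state = next_state
    return data


# Toy 1-D stand-ins, purely illustrative.
def toy_generator():
    return random.gauss(0.0, 1.0)


def toy_sim_step(state, action):
    return state + 0.1 * action, -abs(state)


def toy_policy(state):
    return -state


def toy_discriminator(state):
    # Scores states near 0 as more "dataset-like".
    return 1.0 / (1.0 + math.exp(abs(state) - 1.0))


random.seed(0)
batch = collect_weighted_rollouts(toy_generator, toy_sim_step, toy_policy,
                                  toy_discriminator, n_rollouts=3, horizon=4)
```

In such a setup, the weighted transitions would be mixed with the offline dataset, with each weight scaling that transition's contribution to the training loss; how the paper actually combines the two sources is not stated in the abstract.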

Pages: 5162-5168

Page count: 7
