Hyperparameter Tuning in Offline Reinforcement Learning

Cited by: 0
Authors
Tittaferrante, Andrew [1]
Yassine, Abdulsalam [2 ]
Affiliations
[1] Lakehead Univ, Elect & Comp Engn, Thunder Bay, ON, Canada
[2] Lakehead Univ, Software Engn, Thunder Bay, ON, Canada
Keywords
Deep Learning; Reinforcement Learning; Offline Reinforcement Learning;
DOI
10.1109/ICMLA55696.2022.00101
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose a reliable hyperparameter tuning scheme for offline reinforcement learning. We demonstrate the scheme on the simplest antmaze environment from D4RL, the standard offline benchmark dataset. The usual approach to policy evaluation in offline reinforcement learning involves online evaluation, i.e., cherry-picking the best performance on the test environment. To mitigate this cherry-picking, we propose an ad-hoc online evaluation metric, which we name "median-median-return". This metric enables more reliable reporting of results because it represents the expected performance of the learned policy, taking the median online evaluation performance across both epochs and training runs. To demonstrate our scheme, we employ IQL, a recent state-of-the-art algorithm, and perform a thorough hyperparameter search based on the proposed metric. The tuned architectures enjoy notably stronger cherry-picked performance, and the best models surpass the reported state-of-the-art performance on average.
Pages: 585-590 (6 pages)
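The "median-median-return" metric described in the abstract can be sketched as follows. This is a minimal illustration with made-up evaluation returns, assuming (as the name suggests, though the abstract does not spell out the order of operations) a nested median: first over evaluation epochs within each training run, then over runs.

```python
import numpy as np

# Hypothetical evaluation returns: rows = training runs (seeds),
# columns = periodic online-evaluation epochs.
returns = np.array([
    [0.2, 0.5, 0.7, 0.6],   # run 1
    [0.1, 0.4, 0.6, 0.8],   # run 2
    [0.3, 0.6, 0.5, 0.7],   # run 3
])

# Median over epochs within each run, then median of those
# per-run medians across runs.
per_run_median = np.median(returns, axis=1)
median_median_return = float(np.median(per_run_median))

print(median_median_return)  # prints 0.55
```

Because the median is robust to outliers in both directions, a single lucky evaluation epoch (the cherry-picked maximum) or one unusually good seed cannot dominate the reported number, which is the property the abstract appeals to.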