Automatic Temperature Parameter Tuning for Reinforcement Learning Using Path Integral Policy Improvement

Cited by: 0
Authors
Nakano, Hiroyasu [1 ]
Ariizumi, Ryo [2 ]
Asai, Toru [1 ]
Azuma, Shun-Ichi [3 ]
Affiliations
[1] Nagoya Univ, Dept Mech Syst Engn, Nagoya, Aichi 4648603, Japan
[2] Tokyo Univ Agr & Technol, Dept Mech Syst Engn, Tokyo 1848588, Japan
[3] Kyoto Univ, Grad Sch Informat, Kyoto 6068501, Japan
Funding
Japan Society for the Promotion of Science (JSPS); Japan Science and Technology Agency (JST);
Keywords
Robots; Tuning; Task analysis; Covariance matrices; Trajectory; Legged locomotion; Learning systems; Legged robot; policy improvement; reinforcement learning (RL); robotics; snake robot;
DOI
10.1109/TNNLS.2023.3312857
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, we propose a novel variant of path integral policy improvement with covariance matrix adaptation (PI2-CMA), a reinforcement learning (RL) algorithm that optimizes a parameterized policy for the continuous behavior of robots. PI2-CMA has a hyperparameter called the temperature parameter, whose value is critical for performance; however, little research has addressed how to set it, and the existing tuning method still contains its own performance-critical tunable parameter, so tuning by trial and error remains necessary. Moreover, we show that there is a problem setting that the existing method cannot learn. The proposed method solves both problems by automatically adjusting the temperature parameter at each update. We confirmed the effectiveness of the proposed method using numerical tests.
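To illustrate the role the temperature parameter plays in PI2-style updates, the following is a minimal sketch, assuming the commonly used exponential (softmax) weighting of min-max normalized rollout costs followed by reward-weighted averaging of sampled policy parameters. The normalization step and the function names (`pi2_weights`, `parameter_update`) are illustrative assumptions; this sketch does not reproduce the automatic tuning method proposed in the article.

```python
import numpy as np

def pi2_weights(costs, temperature):
    """PI2-style exponential weighting of rollout costs.

    Lower-cost rollouts receive larger weights; `temperature` controls how
    sharply the weighting concentrates on the best rollouts (small value ->
    near-greedy, large value -> near-uniform).
    """
    costs = np.asarray(costs, dtype=float)
    # Min-max normalization so the temperature acts on a scale-independent
    # cost range (a common implementation choice, assumed here).
    span = costs.max() - costs.min()
    if span < 1e-12:
        return np.full_like(costs, 1.0 / costs.size)
    z = (costs - costs.min()) / span
    w = np.exp(-z / temperature)
    return w / w.sum()

def parameter_update(theta_samples, costs, temperature):
    """Reward-weighted averaging of sampled policy parameter vectors."""
    w = pi2_weights(costs, temperature)
    theta_samples = np.asarray(theta_samples, dtype=float)
    return (w[:, None] * theta_samples).sum(axis=0)

# Example: 10 rollouts of a 3-dimensional policy parameter vector.
rng = np.random.default_rng(0)
theta_samples = rng.normal(size=(10, 3))
costs = rng.uniform(1.0, 5.0, size=10)
for temp in (0.05, 0.5, 5.0):
    print(temp, parameter_update(theta_samples, costs, temp))
```

Running the loop shows the sensitivity the article targets: with a small temperature the update concentrates almost entirely on the lowest-cost rollouts, while a large temperature approaches a plain average over all rollouts.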
Pages: 18200 - 18211
Number of pages: 12