Automatic Temperature Parameter Tuning for Reinforcement Learning Using Path Integral Policy Improvement

Citations: 0

Authors:

Nakano, Hiroyasu [1]

Ariizumi, Ryo [2]

Asai, Toru [1]

Azuma, Shun-Ichi [3]

Affiliations:

[1] Nagoya Univ, Dept Mech Syst Engn, Nagoya, Aichi 4648603, Japan

[2] Tokyo Univ Agr & Technol, Dept Mech Syst Engn, Yoshida Honmachi, Tokyo 1848588, Japan

[3] Kyoto Univ, Grad Sch Informat, Kyoto 6068501, Japan

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024 / Vol. 35 / No. 12

Funding:

Japan Society for the Promotion of Science; Japan Science and Technology Agency

Keywords:

Robots; Tuning; Task analysis; Covariance matrices; Trajectory; Legged locomotion; Learning systems; Legged robot; policy improvement; reinforcement learning (RL); robotics; snake robot;

DOI:

10.1109/TNNLS.2023.3312857

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this article, we propose a novel variant of path integral policy improvement with covariance matrix adaptation (PI2-CMA), which is a reinforcement learning (RL) algorithm that aims to optimize a parameterized policy for the continuous behavior of robots. PI2-CMA has a hyperparameter called the temperature parameter, and its value is critical for performance; however, little research has been conducted on it and the existing method still contains a tunable parameter, which can be critical to performance. Therefore, tuning by trial and error is necessary in the existing method. Moreover, we show that there is a problem setting that cannot be learned by the existing method. The proposed method solves both problems by automatically adjusting the temperature parameter for each update. We confirmed the effectiveness of the proposed method using numerical tests.
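To illustrate why the temperature parameter matters, the following is a minimal sketch of the exponentiated-cost weighting commonly used in path-integral-style policy updates, with a fixed, hand-chosen temperature. It is not the automatic tuning rule proposed in the article, which the abstract does not specify. The function name pi2_weights, the inverse-temperature argument h, and the sample costs are all assumptions made for this example.

```python
import math

def pi2_weights(costs, h=10.0):
    """Map rollout costs to normalized weights (hypothetical helper).

    h plays the role of an inverse temperature: a large h concentrates
    the weight on the lowest-cost rollout, a small h approaches uniform
    averaging. Costs are rescaled to [0, 1] before exponentiation.
    """
    lo, hi = min(costs), max(costs)
    span = hi - lo
    if span == 0.0:
        # All rollouts cost the same, so weight them equally.
        return [1.0 / len(costs)] * len(costs)
    exps = [math.exp(-h * (c - lo) / span) for c in costs]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative costs of three rollouts (made-up numbers).
costs = [1.0, 2.0, 4.0]
sharp = pi2_weights(costs, h=10.0)  # nearly all weight on the best rollout
flat = pi2_weights(costs, h=0.1)    # close to a plain average
```

With a fixed h, the same set of costs can produce either a near-greedy or a near-uniform update, which is the sensitivity that motivates choosing the temperature automatically at each update.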

Pages: 18200-18211

Page count: 12
