Comparative Analysis of Reinforcement Learning Algorithms for Bipedal Robot Locomotion

Cited by: 0

Authors:

Aydogmus, Omur [1]

Yilmaz, Musa [2, 3]

Affiliations:

[1] Fırat Univ, Fac Technol, Dept Mechatron Engn, TR-23119 Elazig, Turkiye

[2] Univ Calif Riverside, Bourns Coll Engn, Ctr Environm Res & Technol, Riverside, CA 92507 USA

[3] Batman Univ, Dept Elect & Elect Engn, TR-72100 Batman, Turkiye

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Robots; Legged locomotion; Training; Optimization; Reinforcement learning; Task analysis; Stability analysis; Hyperparameter optimization; Robot motion; Walking

DOI:

10.1109/ACCESS.2023.3344393

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

This study introduces a hyperparameter optimization methodology for improving bipedal robot locomotion controlled by reinforcement learning (RL) algorithms. Four algorithms were optimized: Proximal Policy Optimization (PPO), Advantage Actor-Critic (A2C), Soft Actor-Critic (SAC), and Twin Delayed Deep Deterministic Policy Gradients (TD3). The optimization used the Tree-structured Parzen Estimator (TPE), a Bayesian optimization technique. All RL algorithms were applied to the same environment, the bipedal walker created within the OpenAI Gym framework. The optimization fine-tuned key hyperparameters, including the learning rate, discount factor, generalized advantage estimation (GAE) parameter, entropy coefficient, and Polyak update parameter, and the study analyzed the impact of these hyperparameters on the performance of each algorithm. The fine-tuned RL algorithms demonstrated significant improvements in performance. The mean rewards over 10 trials were 181.3 for PPO, -122.2 for A2C, 320.3 for SAC, and 278.6 for TD3. These outcomes underscore the effectiveness of the optimization approach in enhancing the locomotion capabilities of the bipedal robot using RL techniques.
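
As an illustration of what three of the tuned hyperparameters control, the following is a minimal pure-Python sketch, not the authors' code. It assumes the usual textbook definitions: the discount factor (gamma) and the GAE parameter (lam) shape advantage estimates in on-policy methods such as PPO and A2C, while the Polyak parameter (tau) sets how quickly target networks track the online networks in off-policy methods such as SAC and TD3. The function names and the toy numbers are hypothetical.

```python
from typing import List


def gae_advantages(rewards: List[float], values: List[float],
                   gamma: float, lam: float) -> List[float]:
    """Generalized advantage estimation for one episode segment.

    `values` holds one more entry than `rewards`: the last entry is the
    bootstrap value of the state reached after the final reward
    (use 0.0 if that state is terminal).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have len(rewards) + 1 entries")
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the accumulated future term.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD error
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def polyak_update(target: List[float], online: List[float],
                  tau: float) -> List[float]:
    """Soft update: target <- tau * online + (1 - tau) * target."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target, online)]


if __name__ == "__main__":
    # Toy inputs chosen only for illustration (not data from the paper).
    print(gae_advantages([1.0, 1.0], [0.0, 0.0, 0.0], gamma=0.5, lam=1.0))
    print(polyak_update([0.0, 2.0], [1.0, 0.0], tau=0.5))
```

With lam = 0 the advantage reduces to the one-step TD error, and with tau = 1 the target parameters are copied from the online parameters outright, which is why both values are typically searched over a range rather than fixed.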


Pages: 7490-7499

Number of pages: 10

50 records in total

[1] Computed-Torque Control of a Simulated Bipedal Robot with Locomotion by Reinforcement Learning
Valle, Carlos Magno C. O.
Tanscheit, Ricardo
Forero Mendoza, Leonardo A.
2016 IEEE LATIN AMERICAN CONFERENCE ON COMPUTATIONAL INTELLIGENCE (LA-CCI), 2016
[2] Hybrid Bipedal Locomotion Based on Reinforcement Learning and Heuristics
Wang, Zhicheng
Wei, Wandi
Xie, Anhuan
Zhang, Yifeng
Wu, Jun
Zhu, Qiuguo
MICROMACHINES, 2022, 13 (10)
[3] Learning Bipedal Robot Locomotion from Human Movement
Taylor, Michael
Bashkirov, Sergey
Rico, Javier Fernandez
Toriyama, Ike
Miyada, Naoyuki
Yanagisawa, Hideki
Ishizuka, Kensaku
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021: 2797-2803
[4] Reinforcement learning for versatile, dynamic, and robust bipedal locomotion control
Li, Zhongyu
Peng, Xue Bin
Abbeel, Pieter
Levine, Sergey
Berseth, Glen
Sreenath, Koushil
INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2024
[5] Machine Learning Algorithms in Bipedal Robot Control
Wang, Shouyi
Chaovalitwongse, Wanpracha
Babuska, Robert
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2012, 42 (05): 728-743
[6] Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots
Li, Zhongyu
Cheng, Xuxin
Peng, Xue Bin
Abbeel, Pieter
Levine, Sergey
Berseth, Glen
Sreenath, Koushil
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021: 2811-2817
[7] Deep Reinforcement Learning for Snake Robot Locomotion
Shi, Junyao
Dear, Tony
Kelly, Scott David
IFAC PAPERSONLINE, 2020, 53 (02): 9688-9695
[8] Continual Reinforcement Learning for Quadruped Robot Locomotion
Gai, Sibo
Lyu, Shangke
Zhang, Hongyin
Wang, Donglin
ENTROPY, 2024, 26 (01)
[9] Learning agile soccer skills for a bipedal robot with deep reinforcement learning
Haarnoja, Tuomas
Moran, Ben
Lever, Guy
Huang, Sandy H.
Tirumala, Dhruva
Humplik, Jan
Wulfmeier, Markus
Tunyasuvunakool, Saran
Siegel, Noah Y.
Hafner, Roland
Bloesch, Michael
Hartikainen, Kristian
Byravan, Arunkumar
Hasenclever, Leonard
Tassa, Yuval
Sadeghi, Fereshteh
Batchelor, Nathan
Casarini, Federico
Saliceti, Stefano
Game, Charles
Sreendra, Neil
Patel, Kushal
Gwira, Marlon
Huber, Andrea
Hurley, Nicole
Nori, Francesco
Hadsell, Raia
Heess, Nicolas
SCIENCE ROBOTICS, 2024, 9 (89)
[10] Learning Robust Locomotion for Bipedal Robot via Embedded Mechanics Properties
Yuanxi Zhang
Xuechao Chen
Fei Meng
Zhangguo Yu
Yidong Du
Junyao Gao
Qiang Huang
Journal of Bionic Engineering, 2024, 21: 1278-1289
