Comparative Analysis of Reinforcement Learning Algorithms for Bipedal Robot Locomotion

Cited by: 0

Authors:

Aydogmus, Omur [1]

Yilmaz, Musa [2,3]

Affiliations:

[1] Fırat Univ, Fac Technol, Dept Mechatron Engn, TR-23119 Elazig, Turkiye

[2] Univ Calif Riverside, Bourns Coll Engn, Ctr Environm Res & Technol, Riverside, CA 92507 USA

[3] Batman Univ, Dept Elect & Elect Engn, TR-72100 Batman, Turkiye

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Robots; Legged locomotion; Training; Optimization; Reinforcement learning; Task analysis; Stability analysis; Hyperparameter optimization; Robot motion; Walking

DOI:

10.1109/ACCESS.2023.3344393

Chinese Library Classification (CLC):

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

This study introduces an optimization methodology for improving bipedal robot locomotion controlled by reinforcement learning (RL) algorithms. Four algorithms were optimized: Proximal Policy Optimization (PPO), Advantage Actor-Critic (A2C), Soft Actor-Critic (SAC), and Twin Delayed Deep Deterministic Policy Gradient (TD3). The hyperparameters were tuned with the Tree-structured Parzen Estimator (TPE), a Bayesian optimization technique. All RL algorithms were applied to the same environment, the bipedal walker, which was created within the OpenAI Gym framework. The tuned hyperparameters included the learning rate, discount factor, generalized advantage estimation (GAE) parameter, entropy coefficient, and Polyak update parameters. The study analyzed the impact of these hyperparameters on the performance of the RL algorithms, and the fine-tuned algorithms demonstrated significant improvements in performance. The mean rewards over 10 trials were 181.3 for PPO, -122.2 for A2C, 320.3 for SAC, and 278.6 for TD3. These outcomes underscore the effectiveness of the optimization approach in enhancing the locomotion capabilities of the bipedal robot using RL techniques.
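
As a rough illustration of where three of the hyperparameters named in the abstract enter the computation, the following sketch shows the discount factor and the GAE parameter in an advantage estimate (as commonly used by PPO and A2C) and the Polyak parameter in a soft target-network update (as commonly used by SAC and TD3). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `gae_advantages` and `polyak_update`, the parameter names `gamma`, `lam`, and `tau`, and the use of plain Python lists in place of network weights are all hypothetical choices made for this example.

```python
from typing import List


def gae_advantages(rewards: List[float], values: List[float],
                   gamma: float, lam: float) -> List[float]:
    """Generalized advantage estimation for one episode.

    Assumption: `values` has one more entry than `rewards`; the last
    entry is the value estimate of the state reached after the final
    step (0.0 if that state is terminal).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error, using discount factor gamma.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Accumulate TD errors backwards, decayed by gamma * lam.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def polyak_update(target: List[float], online: List[float],
                  tau: float) -> List[float]:
    """Soft update: new_target = tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    values = [0.0, 0.0, 0.0, 0.0]  # zero baseline, terminal last state
    print(gae_advantages(rewards, values, gamma=0.5, lam=1.0))
    print(polyak_update([0.0, 2.0], [1.0, 4.0], tau=0.5))
```

With `lam = 0` the estimate reduces to the one-step TD error, and with `lam = 1` it becomes the discounted return minus the value baseline, which is why this parameter trades bias against variance. A small `tau` makes the target weights track the online weights slowly.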


Pages: 7490-7499

Page count: 10
