High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning

Cited by: 26
Authors
Jin, Yongbin [1, 2, 3, 4]
Liu, Xianwei [1]
Shao, Yecheng [1, 4]
Wang, Hongtao [1, 2, 3, 4]
Yang, Wei [1, 2, 3, 4]
Affiliations
[1] Zhejiang Univ, Ctr X Mech, Hangzhou, Peoples R China
[2] Hangzhou Global Sci & Technol Innovat Ctr, ZJU, Hangzhou, Peoples R China
[3] Zhejiang Univ, State Key Lab Fluid Power & Mechatron Syst, Hangzhou, Peoples R China
[4] Zhejiang Univ, Inst Appl Mech, Hangzhou, Peoples R China
Keywords
ENTROPY STABILITY; DYNAMICS; DESIGN; ROBOT; MODEL;
DOI
10.1038/s42256-022-00576-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fast and stable locomotion of legged robots involves demanding and contradictory requirements, in particular a rapid control frequency as well as an accurate dynamics model. Benefiting from the universal approximation ability and offline optimization of neural networks, reinforcement learning has been used to solve various challenging problems in legged robot locomotion; however, the optimal control of a quadruped robot requires optimizing multiple objectives such as keeping balance, improving efficiency, realizing a periodic gait and following commands. These objectives cannot always be achieved simultaneously, especially at high speed. Here, we introduce an imitation-relaxation reinforcement learning (IRRL) method to optimize the objectives in stages. To bridge the gap between simulation and reality, we further introduce the concept of stochastic stability into system robustness analysis. The state space entropy decreasing rate is a quantitative metric and can sharply capture the occurrence of period-doubling bifurcation and possible chaos. By employing IRRL in training and the stochastic stability analysis, we are able to demonstrate a stable running speed of 5.0 m s⁻¹ for an MIT-MiniCheetah-like robot.
Pages: 1198 - 1208
Page count: 11
Related papers
50 in total
  • [21] Learning in a high dimensional space: Fast omnidirectional quadrupedal locomotion
    Hebbel, Matthias
    Nistico, Walter
    Fisseler, Denis
    ROBOCUP 2006: ROBOT SOCCER WORLD CUP X, 2007, 4434 : 314 - +
  • [22] Application of Reinforcement Learning on High-Speed Rail Cognitive Radio
    Wu, Qing-ting
    Wu, Cheng
    Wang, Yi-ming
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE: TECHNIQUES AND APPLICATIONS, AITA 2016, 2016, : 332 - 336
  • [23] Real-Time Trajectory Adaptation for Quadrupedal Locomotion using Deep Reinforcement Learning
    Gangapurwala, Siddhant
    Geisert, Mathieu
    Orsolino, Romeo
    Fallon, Maurice
    Havoutis, Ioannis
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 5973 - 5979
  • [24] Imitation-Enhanced Reinforcement Learning With Privileged Smooth Transition for Hexapod Locomotion
    Zhang, Zhelin
    Liu, Tie
    Ding, Liang
    Wang, Haoyu
    Xu, Peng
    Yang, Huaiguang
    Gao, Haibo
    Deng, Zongquan
    Pajarinen, Joni
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (01): : 350 - 357
  • [25] Economical Quadrupedal Multi-Gait Locomotion via Gait-Heuristic Reinforcement Learning
    Wei, Lang
    Zou, Jinzhou
    Yu, Xi
    Liu, Liangyu
    Liao, Jianbin
    Wang, Wei
    Zhang, Tong
    JOURNAL OF BIONIC ENGINEERING, 2024, 21 (04) : 1720 - 1732
  • [26] A reinforcement learning approach to congestion control of high-speed multimedia networks
    Shaio, MC
    Tan, SW
    Hwang, KS
    Wu, CS
    CYBERNETICS AND SYSTEMS, 2005, 36 (02) : 181 - 202
  • [27] A Deep Reinforcement Learning Approach for the Traffic Management of High-Speed Railways
    Wu, Wei
    Yin, Jiateng
    Pu, Fan
    Su, Shuai
    Tang, Tao
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 2368 - 2373
  • [28] Reinforcement of high-speed steel ingots
    Baranov, A.A.
    Lyubchenko, A.G.
    Pashinskij, V.V.
    Antonov, V.V.
    Liteinoe Proizvodstvo, 1991, (05): : 34 - 35
  • [29] High-Speed Racing Reinforcement Learning Network: Learning the Environment Using Scene Graphs
    Shi, Jingjing
    Li, Ruiqin
    Yu, Daguo
    IEEE ACCESS, 2024, 12 : 116771 - 116785
  • [30] Reinforcement Learning-based Optimization of Speed Profile for High-speed Train with Temporary Speed Restriction
    Zhou M.
    Dong H.
    Zhou X.
    Xu W.
    Ning L.
    Tiedao Xuebao/Journal of the China Railway Society, 2023, 45 (02): : 84 - 92