Policy Learning with an Efficient Black-Box Optimization Algorithm

Cited by: 1
Authors
Hwangbo, Jemin [1]
Gehring, Christian [1]
Sommer, Hannes [1]
Siegwart, Roland [1]
Buchli, Jonas [2]
Affiliations
[1] Swiss Fed Inst Technol, Autonomous Syst Lab, Zurich, Switzerland
[2] Swiss Fed Inst Technol, Agile & Dexterous Robot Lab, Zurich, Switzerland
Funding
Swiss National Science Foundation;
Keywords
Policy optimization; robotic learning; black-box optimization;
DOI
10.1142/S0219843615500292
Chinese Library Classification number
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
Robotic learning on real hardware requires an efficient algorithm that minimizes the number of trials needed to learn an optimal policy. Prolonged use of hardware causes wear and tear on the system and demands more attention from an operator. To this end, we present a novel black-box optimization algorithm, Reward Optimization with Compact Kernels and fast natural gradient regression (ROCK*). Our algorithm immediately updates knowledge after a single trial and is able to extrapolate in a controlled manner. These features make fast and safe learning on real hardware possible. The performance of our method is evaluated with standard benchmark functions that are commonly used to test optimization algorithms. We also present three different robotic optimization examples using ROCK*. The first robotic example is on a simulated robot arm, the second is on a real articulated legged system, and the third is on a simulated quadruped robot with 12 actuated joints. ROCK* outperforms the current state-of-the-art algorithms in all tasks, sometimes even by an order of magnitude.
Pages: 20