Policy Learning with an Efficient Black-Box Optimization Algorithm

Cited by: 1
Authors
Hwangbo, Jemin [1]
Gehring, Christian [1]
Sommer, Hannes [1]
Siegwart, Roland [1]
Buchli, Jonas [2]
Affiliations
[1] Swiss Fed Inst Technol, Autonomous Syst Lab, Zurich, Switzerland
[2] Swiss Fed Inst Technol, Agile & Dexterous Robot Lab, Zurich, Switzerland
Funding
Swiss National Science Foundation
Keywords
Policy optimization; robotic learning; black-box optimization
DOI
10.1142/S0219843615500292
Chinese Library Classification number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Robotic learning on real hardware requires an efficient algorithm that minimizes the number of trials needed to learn an optimal policy. Prolonged use of hardware causes wear and tear on the system and demands more attention from an operator. To this end, we present a novel black-box optimization algorithm, Reward Optimization with Compact Kernels and fast natural gradient regression (ROCK*). Our algorithm immediately updates knowledge after a single trial and is able to extrapolate in a controlled manner. These features make fast and safe learning on real hardware possible. The performance of our method is evaluated with standard benchmark functions that are commonly used to test optimization algorithms. We also present three different robotic optimization examples using ROCK*. The first robotic example is on a simulated robot arm, the second is on a real articulated legged system, and the third is on a simulated quadruped robot with 12 actuated joints. ROCK* outperforms the current state-of-the-art algorithms in all tasks, sometimes even by an order of magnitude.
Pages: 20
Related papers
50 in total
  • [1] ROCK* - Efficient Black-box Optimization for Policy Learning
    Hwangbo, Jemin
    Gehring, Christian
    Sommer, Hannes
    Siegwart, Roland
    Buchli, Jonas
    2014 14TH IEEE-RAS INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS (HUMANOIDS), 2014, : 535 - 540
  • [2] Meta-Learning for Black-Box Optimization
    Vishnu, T. V.
    Malhotra, Pankaj
    Narwariya, Jyoti
    Vig, Lovekesh
    Shroff, Gautam
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II, 2020, 11907 : 366 - 381
  • [3] Online black-box algorithm portfolios for continuous optimization
    20174004240282
    (1) Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Cybernetics Technická 2, Prague 6; 166 27, Czech Republic, 1600, (Springer Verlag):
  • [4] Online Black-Box Algorithm Portfolios for Continuous Optimization
    Baudis, Petr
    Posik, Petr
    PARALLEL PROBLEM SOLVING FROM NATURE - PPSN XIII, 2014, 8672 : 40 - 49
  • [5] Black-Box Function Aerodynamic Topology Optimization Algorithm via Machine Learning Technologies
    Ban, Naohiko
    Yamazaki, Wataru
    AIAA JOURNAL, 2021, 59 (12) : 5174 - 5185
  • [6] Versatile Black-Box Optimization
    Liu, Jialin
    Moreau, Antoine
    Preuss, Mike
    Rapin, Jeremy
    Roziere, Baptiste
    Teytaud, Fabien
    Teytaud, Olivier
    GECCO'20: PROCEEDINGS OF THE 2020 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2020, : 620 - 628
  • [7] Black-box Optimization with a Politician
    Bubeck, Sebastien
    Lee, Yin-Tat
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [8] Sampling Effects on Algorithm Selection for Continuous Black-Box Optimization
    Munoz, Mario Andres
    Kirley, Michael
    ALGORITHMS, 2021, 14 (01)
  • [9] Comparing Algorithm Selection Approaches on Black-Box Optimization Problems
    Kostovska, Ana
    Jankovic, Anja
    Vermetten, Diederick
    Dzeroski, Saso
    Eftimov, Tome
    Doerr, Carola
    PROCEEDINGS OF THE 2023 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2023 COMPANION, 2023, : 495 - 498
  • [10] Bayesian Active Meta-Learning for Black-Box Optimization
    Nikoloska, Ivana
    Simeone, Osvaldo
    2022 IEEE 23RD INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATION (SPAWC), 2022,