Policy Learning with an Efficient Black-Box Optimization Algorithm

Cited by: 1
Authors
Hwangbo, Jemin [1]
Gehring, Christian [1]
Sommer, Hannes [1]
Siegwart, Roland [1]
Buchli, Jonas [2]
Affiliations
[1] Swiss Fed Inst Technol, Autonomous Syst Lab, Zurich, Switzerland
[2] Swiss Fed Inst Technol, Agile & Dexterous Robot Lab, Zurich, Switzerland
Funding
Swiss National Science Foundation;
Keywords
Policy optimization; robotic learning; black-box optimization;
DOI
10.1142/S0219843615500292
Chinese Library Classification number
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
Robotic learning on real hardware requires an efficient algorithm that minimizes the number of trials needed to learn an optimal policy. Prolonged use of hardware causes wear and tear on the system and demands more attention from an operator. To this end, we present a novel black-box optimization algorithm, Reward Optimization with Compact Kernels and fast natural gradient regression (ROCK*). Our algorithm updates its knowledge immediately after a single trial and is able to extrapolate in a controlled manner. These features make fast and safe learning on real hardware possible. The performance of our method is evaluated on standard benchmark functions that are commonly used to test optimization algorithms. We also present three different robotic optimization examples using ROCK*. The first robotic example is on a simulated robot arm, the second is on a real articulated legged system, and the third is on a simulated quadruped robot with 12 actuated joints. ROCK* outperforms the current state-of-the-art algorithms in all tasks, sometimes even by an order of magnitude.
Pages: 20