Policy Learning with an Efficient Black-Box Optimization Algorithm

Cited by: 1

Authors:

Hwangbo, Jemin ^{[1]}

Gehring, Christian ^{[1]}

Sommer, Hannes ^{[1]}

Siegwart, Roland ^{[1]}

Buchli, Jonas ^{[2]}

Affiliations:

[1] Swiss Fed Inst Technol, Autonomous Syst Lab, Zurich, Switzerland

[2] Swiss Fed Inst Technol, Agile & Dexterous Robot Lab, Zurich, Switzerland

Source:

INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS | 2015 / Vol. 12 / Issue 03

Funding:

Swiss National Science Foundation;

Keywords:

Policy optimization; robotic learning; black-box optimization;

DOI:

10.1142/S0219843615500292

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification codes:

080202 ; 1405 ;

Abstract:

Robotic learning on real hardware requires an efficient algorithm that minimizes the number of trials needed to learn an optimal policy. Prolonged use of hardware causes wear and tear on the system and demands more attention from an operator. To this end, we present a novel black-box optimization algorithm, Reward Optimization with Compact Kernels and fast natural gradient regression (ROCK*). Our algorithm immediately updates knowledge after a single trial and is able to extrapolate in a controlled manner. These features make fast and safe learning on real hardware possible. The performance of our method is evaluated with standard benchmark functions that are commonly used to test optimization algorithms. We also present three different robotic optimization examples using ROCK*. The first robotic example is on a simulated robot arm, the second is on a real articulated legged system, and the third is on a simulated quadruped robot with 12 actuated joints. ROCK* outperforms the current state-of-the-art algorithms in all tasks, sometimes even by an order of magnitude.


Pages: 20

