Policy Learning with an Efficient Black-Box Optimization Algorithm

Cited by: 1

Authors:

Hwangbo, Jemin [1]

Gehring, Christian [1]

Sommer, Hannes [1]

Siegwart, Roland [1]

Buchli, Jonas [2]

Affiliations:

[1] Swiss Fed Inst Technol, Autonomous Syst Lab, Zurich, Switzerland

[2] Swiss Fed Inst Technol, Agile & Dexterous Robot Lab, Zurich, Switzerland

Source:

INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS | 2015 / Vol. 12 / No. 03

Funding:

Swiss National Science Foundation;

Keywords:

Policy optimization; robotic learning; black-box optimization;

DOI:

10.1142/S0219843615500292

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

Robotic learning on real hardware requires an efficient algorithm that minimizes the number of trials needed to learn an optimal policy. Prolonged use of hardware causes wear and tear on the system and demands more attention from an operator. To this end, we present a novel black-box optimization algorithm, Reward Optimization with Compact Kernels and fast natural gradient regression (ROCK*). Our algorithm immediately updates knowledge after a single trial and is able to extrapolate in a controlled manner. These features make fast and safe learning on real hardware possible. The performance of our method is evaluated with standard benchmark functions that are commonly used to test optimization algorithms. We also present three different robotic optimization examples using ROCK*. The first robotic example is on a simulated robot arm, the second is on a real articulated legged system, and the third is on a simulated quadruped robot with 12 actuated joints. ROCK* outperforms the current state-of-the-art algorithms in all tasks, sometimes even by an order of magnitude.

Pages: 20
