Robust Black-Box Optimization for Stochastic Search and Episodic Reinforcement Learning

Cited by: 0

Authors:

Huttenrauch, Maximilian [1]

Neumann, Gerhard [1]

Affiliations:

[1] Karlsruhe Inst Technol, Dept Comp Sci, Karlsruhe, Germany

Source:

JOURNAL OF MACHINE LEARNING RESEARCH | 2024 / Vol. 25

Keywords:

black-box optimization; stochastic search; derivative-free optimization; evolution strategies; episodic reinforcement learning; EVOLUTIONARY;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

Black-box optimization is a versatile approach to solve complex problems where the objective function is not explicitly known and no higher-order information is available. Due to its general nature, it finds widespread applications in function optimization as well as machine learning, especially episodic reinforcement learning tasks. While traditional black-box optimizers like CMA-ES may falter in noisy scenarios due to their reliance on ranking-based transformations, a promising alternative emerges in the form of the Model-based Relative Entropy Stochastic Search (MORE) algorithm. MORE can be derived from natural policy gradients and compatible function approximation and directly optimizes the expected fitness without resorting to rankings. However, in its original formulation, MORE often cannot achieve state-of-the-art performance. In this paper, we improve MORE by decoupling the update of the search distribution's mean and covariance, by an improved entropy scheduling technique based on an evolution path, resulting in faster convergence, and by a simplified model learning approach in comparison to the original paper. We show that our algorithm performs comparably to state-of-the-art black-box optimizers on standard benchmark functions. Further, it clearly outperforms ranking-based methods and other policy-gradient-based black-box algorithms as well as state-of-the-art deep reinforcement learning algorithms when used for episodic reinforcement learning tasks.


Pages: 1 / 44

Page count: 44

50 items in total

[1] Approximation Algorithms for Distributionally-Robust Stochastic Optimization with Black-Box Distributions
Linhares, Andre
Swamy, Chaitanya
PROCEEDINGS OF THE 51ST ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING (STOC '19), 2019, : 768 - 779
[2] Directed Exploration in Black-Box Optimization for Multi-Objective Reinforcement Learning
Garcia, Javier
Iglesias, Roberto
Rodriguez, Miguel A.
Regueiro, Carlos, V
INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & DECISION MAKING, 2019, 18 (03) : 1045 - 1082
[3] Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning
Usmanova, Ilnura
As, Yarden
Kamgarpour, Maryam
Krause, Andreas
JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
[4] A robust and efficient triangulation-based optimization algorithm for stochastic black-box systems
McGill, J. A.
Ogunnaike, B. A.
Vlachos, D. G.
COMPUTERS & CHEMICAL ENGINEERING, 2014, 60 : 143 - 153
[5] Sparse Black-Box Video Attack with Reinforcement Learning
Xingxing Wei
Huanqian Yan
Bo Li
International Journal of Computer Vision, 2022, 130 : 1459 - 1473
[6] Sparse Black-Box Video Attack with Reinforcement Learning
Wei, Xingxing
Yan, Huanqian
Li, Bo
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (06) : 1459 - 1473
[7] Deep Black-Box Reinforcement Learning with Movement Primitives
Otto, Fabian
Celik, Onur
Zhou, Hongyi
Ziesche, Hanna
Ngo Anh Vien
Neumann, Gerhard
CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 1244 - 1265
[8] Distributed Evolution Strategies for Black-Box Stochastic Optimization
He, Xiaoyu
Zheng, Zibin
Chen, Chuan
Zhou, Yuren
Luo, Chuan
Lin, Qingwei
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3718 - 3731
[9] Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search
Wang, Linnan
Fonseca, Rodrigo
Tian, Yuandong
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
[10] Meta-Learning for Black-Box Optimization
Vishnu, T. V.
Malhotra, Pankaj
Narwariya, Jyoti
Vig, Lovekesh
Shroff, Gautam
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II, 2020, 11907 : 366 - 381
