Robust Black-Box Optimization for Stochastic Search and Episodic Reinforcement Learning

Cited by: 0
Authors
Huttenrauch, Maximilian [1]
Neumann, Gerhard [1]
Affiliations
[1] Karlsruhe Inst Technol, Dept Comp Sci, Karlsruhe, Germany
Keywords
black-box optimization; stochastic search; derivative-free optimization; evolution strategies; episodic reinforcement learning; EVOLUTIONARY;
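DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract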
Black-box optimization is a versatile approach to solving complex problems where the objective function is not explicitly known and no higher-order information is available. Due to its general nature, it finds widespread application in function optimization as well as machine learning, especially in episodic reinforcement learning tasks. While traditional black-box optimizers like CMA-ES may falter in noisy scenarios due to their reliance on ranking-based transformations, a promising alternative emerges in the form of the Model-based Relative Entropy Stochastic Search (MORE) algorithm. MORE can be derived from natural policy gradients and compatible function approximation, and it directly optimizes the expected fitness without resorting to rankings. However, in its original formulation, MORE often cannot achieve state-of-the-art performance. In this paper, we improve MORE by decoupling the update of the search distribution's mean and covariance, by introducing an improved entropy scheduling technique based on an evolution path that results in faster convergence, and by simplifying the model learning approach in comparison to the original paper. We show that our algorithm performs comparably to state-of-the-art black-box optimizers on standard benchmark functions. Further, it clearly outperforms ranking-based methods and other policy-gradient-based black-box algorithms, as well as state-of-the-art deep reinforcement learning algorithms, when used for episodic reinforcement learning tasks.
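The contrast the abstract draws between ranking-based transformations and direct use of the fitness values can be illustrated with a small sketch. The function `rank_utilities` and the two sample batches below are hypothetical and chosen only for illustration; this is neither the weighting scheme used by CMA-ES nor any part of MORE. It only shows that a rank transformation keeps the ordering of the samples and discards their magnitudes, so two batches with very different mean fitness can map to identical utilities.

```python
from statistics import mean


def rank_utilities(fitness):
    """Map fitness values to centered rank utilities in [-0.5, 0.5].

    Hypothetical helper for illustration: only the ordering of the
    values matters, their magnitudes are discarded.
    """
    n = len(fitness)
    # Indices of the samples sorted from worst (lowest) to best (highest).
    order = sorted(range(n), key=lambda i: fitness[i])
    utilities = [0.0] * n
    for rank, idx in enumerate(order):
        utilities[idx] = rank / (n - 1) - 0.5
    return utilities


# Two made-up batches of returns with the same ordering
# but very different magnitudes.
batch_a = [1.0, 2.0, 3.0, 4.0]
batch_b = [1.0, 1.1, 1.2, 100.0]

util_a = rank_utilities(batch_a)
util_b = rank_utilities(batch_b)

# The rank transformation cannot tell the two batches apart ...
same_utilities = util_a == util_b
# ... although their sample means (a simple estimate of expected fitness) differ.
mean_a = mean(batch_a)
mean_b = mean(batch_b)

print("identical rank utilities:", same_utilities)
print("mean fitness differs:", mean_a != mean_b)
```

An update that works on the fitness values themselves sees the difference between the two batches, whereas an update driven purely by ranks does not; how this plays out under noise depends on the specific algorithm and is not shown here.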
Pages: 1 / 44
Page count: 44
Related papers
50 in total
  • [1] Approximation Algorithms for Distributionally-Robust Stochastic Optimization with Black-Box Distributions
    Linhares, Andre
    Swamy, Chaitanya
    PROCEEDINGS OF THE 51ST ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING (STOC '19), 2019, : 768 - 779
  • [2] Directed Exploration in Black-Box Optimization for Multi-Objective Reinforcement Learning
    Garcia, Javier
    Iglesias, Roberto
    Rodriguez, Miguel A.
    Regueiro, Carlos, V
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & DECISION MAKING, 2019, 18 (03) : 1045 - 1082
  • [3] Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning
    Usmanova, Ilnura
    As, Yarden
    Kamgarpour, Maryam
    Krause, Andreas
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [4] A robust and efficient triangulation-based optimization algorithm for stochastic black-box systems
    McGill, J. A.
    Ogunnaike, B. A.
    Vlachos, D. G.
    COMPUTERS & CHEMICAL ENGINEERING, 2014, 60 : 143 - 153
  • [5] Sparse Black-Box Video Attack with Reinforcement Learning
    Xingxing Wei
    Huanqian Yan
    Bo Li
    International Journal of Computer Vision, 2022, 130 : 1459 - 1473
  • [6] Sparse Black-Box Video Attack with Reinforcement Learning
    Wei, Xingxing
    Yan, Huanqian
    Li, Bo
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (06) : 1459 - 1473
  • [7] Deep Black-Box Reinforcement Learning with Movement Primitives
    Otto, Fabian
    Celik, Onur
    Zhou, Hongyi
    Ziesche, Hanna
    Ngo Anh Vien
    Neumann, Gerhard
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 1244 - 1265
  • [8] Distributed Evolution Strategies for Black-Box Stochastic Optimization
    He, Xiaoyu
    Zheng, Zibin
    Chen, Chuan
    Zhou, Yuren
    Luo, Chuan
    Lin, Qingwei
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3718 - 3731
  • [9] Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search
    Wang, Linnan
    Fonseca, Rodrigo
    Tian, Yuandong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [10] Meta-Learning for Black-Box Optimization
    Vishnu, T. V.
    Malhotra, Pankaj
    Narwariya, Jyoti
    Vig, Lovekesh
    Shroff, Gautam
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II, 2020, 11907 : 366 - 381