Deriving and Improving CMA-ES with Information Geometric Trust Regions

Cited by: 16
Authors
Abdolmaleki, Abbas [1 ,2 ,3 ]
Price, Bob [1 ]
Lau, Nuno [4 ]
Reis, Luis Paulo [3 ,5 ]
Neumann, Gerhard [6 ,7 ]
Affiliations
[1] Palo Alto Res Ctr, PARC, Palo Alto, CA 94304 USA
[2] Univ Aveiro, IEETA, Aveiro, Portugal
[3] Univ Porto, LIACC, Porto, Portugal
[4] Univ Porto, IEETA, DETI, Porto, Portugal
[5] Univ Minho, DSI, Braga, Portugal
[6] Tech Univ Darmstadt, CLAS, Darmstadt, Germany
[7] Univ Lincoln, L CAS, Lincoln, England
Source
PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'17) | 2017
Keywords
Stochastic Search; Expectation Maximisation; Trust Regions;
DOI
10.1145/3071178.3071252
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
CMA-ES is one of the most popular stochastic search algorithms. It performs favourably in many tasks without the need for extensive parameter tuning. The algorithm has many beneficial properties, including automatic step-size adaptation, efficient covariance updates that incorporate the current samples as well as the evolution path, and its invariance properties. Its update rules are composed of well-established heuristics, and the theoretical foundations of some of these rules are well understood. In this paper we fully derive all CMA-ES update rules within the framework of expectation-maximisation-based stochastic search algorithms using information-geometric trust regions. We show that the use of the trust region results in updates similar to those of CMA-ES for the mean and the covariance matrix, while it allows for the derivation of an improved update rule for the step-size. Our new algorithm, Trust-Region Covariance Matrix Adaptation Evolution Strategy (TR-CMA-ES), is fully derived from first-order optimisation principles and performs favourably compared to the standard CMA-ES algorithm.
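For intuition only, the sketch below shows the kind of EM-style weighted Gaussian search update the abstract alludes to: sample from a Gaussian, weight samples by fitness, and re-fit the distribution. The helper name `em_search_step`, the toy sphere objective, and the fixed temperature `eta` are illustrative assumptions; the paper's information-geometric trust region (a KL bound that determines this temperature) and the step-size rule are not implemented here, so this is not the authors' algorithm.

```python
# Minimal sketch (not the authors' code) of an EM-style Gaussian search step.
# TR-CMA-ES derives the sample weighting from a KL trust-region constraint;
# here a hand-picked temperature `eta` crudely stands in for that mechanism.
import numpy as np

def sphere(x):
    """Toy objective: minimise the squared norm (optimum at the origin)."""
    return np.sum(x ** 2)

def em_search_step(mean, cov, objective, n_samples=50, eta=3.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    fitness = np.array([objective(x) for x in samples])
    # Exponential weighting of standardised fitness (lower is better);
    # the real algorithm would choose the temperature from the KL bound.
    z = (fitness - fitness.mean()) / (fitness.std() + 1e-12)
    w = np.exp(-eta * z)
    w /= w.sum()
    # Weighted maximum-likelihood (M-step) re-fit of the Gaussian.
    new_mean = w @ samples
    diff = samples - new_mean
    new_cov = (w[:, None] * diff).T @ diff + 1e-8 * np.eye(len(mean))
    return new_mean, new_cov

mean, cov = np.ones(5), np.eye(5)
for _ in range(100):
    mean, cov = em_search_step(mean, cov, sphere)
print(mean)  # should approach the optimum at the origin
```

Without the trust region, such a fixed-temperature update can shrink the covariance too aggressively; bounding the KL divergence between successive search distributions, as the paper proposes, is what controls that step.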
Pages: 657-664
Number of pages: 8