Automatic Tuning of Stochastic Gradient Descent with Bayesian Optimisation

Cited by: 1
Authors
Picheny, Victor [1 ]
Dutordoir, Vincent [1 ]
Artemev, Artem [1 ]
Durrande, Nicolas [1 ]
Affiliation
[1] PROWLER Io, 72 Hills Rd, Cambridge CB2 1LA, England
Keywords
Learning rate; Gaussian process; Variational inference;
DOI
10.1007/978-3-030-67664-3_26
CLC number (Chinese Library Classification)
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Many machine learning models require a training procedure based on running stochastic gradient descent. A key element for the efficiency of those algorithms is the choice of the learning rate schedule. While finding good learning rate schedules using Bayesian optimisation has been tackled by several authors, adapting the schedule dynamically in a data-driven way is an open question. This is of high practical importance to users who need to train a single, expensive model. To tackle this problem, we introduce an original probabilistic model for traces of optimisers, based on latent Gaussian processes and an auto-regressive formulation, that flexibly adjusts to the abrupt changes in behaviour induced by new learning rate values. As illustrated, this model is well suited to a range of problems: first, the on-line adaptation of the learning rate for a cold-started run; then, tuning the schedule over a set of similar tasks (in a classical BO setup), as well as warm-starting the schedule for a new task.
Pages: 431-446
Number of pages: 16
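
To make the setting described in the abstract concrete, the sketch below shows a generic Bayesian-optimisation loop that fits a Gaussian process surrogate to the loss observed after short SGD training segments and uses an expected-improvement acquisition to pick the learning rate for the next segment. This is only a minimal illustration under assumed names and a toy quadratic objective (train_segment, expected_improvement, and the scikit-learn surrogate are all stand-ins); it does not implement the paper's latent-GP, auto-regressive model of optimiser traces or its on-line schedule adaptation.

# Minimal sketch: BO over the (log) learning rate of SGD on a toy problem.
# Hypothetical helpers; NOT the authors' trace model.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def train_segment(log_lr, n_steps=50):
    """Run a few SGD steps on a toy quadratic and return the final loss."""
    lr = 10.0 ** log_lr
    w = np.array([5.0, -3.0])
    for _ in range(n_steps):
        grad = 2.0 * w + rng.normal(scale=0.1, size=2)  # noisy gradient of ||w||^2
        w = w - lr * grad
    return float(np.sum(w ** 2))

def expected_improvement(gp, X_cand, best):
    """Standard expected-improvement acquisition for minimisation."""
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# Observed (log10 learning rate, end-of-segment loss) pairs.
X = rng.uniform(-4.0, -0.5, size=(3, 1))
y = np.array([train_segment(x[0]) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for it in range(10):
    gp.fit(X, y)
    candidates = np.linspace(-4.0, -0.5, 200).reshape(-1, 1)
    ei = expected_improvement(gp, candidates, y.min())
    next_log_lr = candidates[np.argmax(ei)]  # learning rate tried in the next segment
    loss = train_segment(next_log_lr[0])
    X = np.vstack([X, next_log_lr])
    y = np.append(y, loss)
    print(f"iter {it}: log10(lr) = {next_log_lr[0]:.2f}, loss = {loss:.4f}")

print("best log10(lr):", X[np.argmin(y), 0])

Note that this sketch restarts every segment from the same initial weights, so it effectively searches for a single good constant learning rate; the paper's contribution is precisely to go beyond this by modelling the trace of a single ongoing run and adapting the schedule as training progresses.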