META-GRADIENTS IN NON-STATIONARY ENVIRONMENTS

Cited by: 0
Authors
Luketina, Jelena [1, 2]
Flennerhag, Sebastian [2]
Schroecker, Yannick [2]
Abel, David [2]
Zahavy, Tom [2]
Singh, Satinder [2]
Affiliations
[1] Univ Oxford, Oxford, England
[2] DeepMind, London, England
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Meta-gradient methods (Xu et al., 2018; Zahavy et al., 2020) offer a promising solution to the problem of hyperparameter selection and adaptation in non-stationary reinforcement learning problems. However, the properties of meta-gradients in such environments have not been systematically studied. In this work, we bring new clarity to meta-gradients in non-stationary environments. Concretely, we ask: (i) how much information should be given to the learned optimizers, so as to enable faster adaptation and generalization over a lifetime, (ii) what meta-optimizer functions are learned in this process, and (iii) whether meta-gradient methods provide a bigger advantage in highly non-stationary environments. To study the effect of information provided to the meta-optimizer, as in recent works (Flennerhag et al., 2022; Almeida et al., 2021), we replace the tuned meta-parameters of fixed update rules with learned meta-parameter functions of selected context features. The context features carry information about agent performance and changes in the environment and hence can inform learned meta-parameter schedules. We find that adding more contextual information is generally beneficial, leading to faster adaptation of meta-parameter values and increased performance. We support these results with a qualitative analysis of resulting meta-parameter schedules and learned functions of context features. Lastly, we find that without context, meta-gradients do not provide a consistent advantage over the baseline in highly non-stationary environments. Our findings suggest that contextualising meta-gradients can play a pivotal role in extracting high performance from meta-gradients in non-stationary settings.
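The idea of replacing a tuned meta-parameter with a learned function of context features can be illustrated with a toy sketch. The example below is not the authors' algorithm or experimental setup: it assumes a scalar parameter tracking a target that flips sign periodically, a step size given by a sigmoid of a linear function of two hypothetical context features (a bias term and the current absolute error), and a one-step truncated meta-gradient. All function names and constants are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def step_size(phi, context):
    """Meta-parameter function: maps context features to a step size in (0, 1)."""
    return sigmoid(sum(p * c for p, c in zip(phi, context)))


def inner_update(w, phi, context, target):
    """One gradient step on the inner loss 0.5 * (w - target)^2."""
    alpha = step_size(phi, context)
    return w - alpha * (w - target)


def meta_loss(w, phi, context, target, next_target):
    """Loss of the updated parameter on the next target (assumed meta-objective)."""
    w_new = inner_update(w, phi, context, target)
    return 0.5 * (w_new - next_target) ** 2


def meta_grad(w, phi, context, target, next_target):
    """Gradient of meta_loss w.r.t. phi, differentiating through the inner update."""
    alpha = step_size(phi, context)
    w_new = inner_update(w, phi, context, target)
    d_loss_d_alpha = (w_new - next_target) * -(w - target)
    # Chain rule through the sigmoid: d alpha / d phi_i = alpha * (1 - alpha) * c_i.
    return [d_loss_d_alpha * alpha * (1.0 - alpha) * c for c in context]


def run(num_steps=400, period=50, meta_lr=0.5):
    """Track a target that flips between +1 and -1 every `period` steps."""
    w, phi = 0.0, [0.0, 0.0]
    for t in range(num_steps):
        target = 1.0 if (t // period) % 2 == 0 else -1.0
        next_target = 1.0 if ((t + 1) // period) % 2 == 0 else -1.0
        context = [1.0, abs(w - target)]  # bias + current error magnitude
        grad = meta_grad(w, phi, context, target, next_target)
        w = inner_update(w, phi, context, target)
        phi = [p - meta_lr * g for p, g in zip(phi, grad)]
    return w, phi


if __name__ == "__main__":
    final_w, final_phi = run()
    print("final parameter:", final_w)
    print("learned meta-parameter weights:", final_phi)
```

In this sketch the context-free case corresponds to dropping the second feature, so that the step size reduces to a single learned constant; the error feature lets the step size respond to a change in the target.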
Pages: 16
Related papers
50 in total
  • [1] Optimistic Meta-Gradients
    Flennerhag, Sebastian
    Zahavy, Tom
    O'Donoghue, Brendan
    van Hasselt, Hado
    Gyorgy, Andras
    Singh, Satinder
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [2] Meta-Reinforcement Learning in Non-Stationary and Dynamic Environments
    Bing, Zhenshan
    Lerch, David
    Huang, Kai
    Knoll, Alois
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (03) : 3476 - 3491
  • [3] Meta-learning optimal parameter values in non-stationary environments
    Sikora, Riyaz T.
    KNOWLEDGE-BASED SYSTEMS, 2008, 21 (08) : 800 - 806
  • [4] Detection and estimation in non-stationary environments
    Toolan, TM
    Tufts, DW
    CONFERENCE RECORD OF THE THIRTY-SEVENTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, VOLS 1 AND 2, 2003, : 797 - 801
  • [5] Adaptive beamforming in non-stationary environments
    Cox, H
    THIRTY-SIXTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS - CONFERENCE RECORD, VOLS 1 AND 2, CONFERENCE RECORD, 2002, : 431 - 438
  • [6] Rewiring Neurons in Non-Stationary Environments
    Sun, Zhicheng
    Mu, Yadong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [7] Social Learning in non-stationary environments
    Boursier, Etienne
    Perchet, Vianney
    Scarsini, Marco
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 167, 2022, 167
  • [8] FLOODING RISK ASSESSMENT IN STATIONARY AND NON-STATIONARY ENVIRONMENTS
    Thomson, Rhys
    Drynan, Leo
    Ball, James
    Veldema, Ailsa
    Phillips, Brett
    Babister, Mark
    PROCEEDINGS OF THE 36TH IAHR WORLD CONGRESS: DELTAS OF THE FUTURE AND WHAT HAPPENS UPSTREAM, 2015, : 5167 - 5177
  • [9] An Ensemble Method for Incremental Classification in Stationary and Non-stationary Environments
    Nanculef, Ricardo
    Lopez, Erick
    Allende, Hector
    Allende-Cid, Hector
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, 2011, 7042 : 541 - 548
  • [10] Speech recognition in non-stationary adverse environments
    Wang, ZH
    Kenny, P
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 265 - 268