META-GRADIENTS IN NON-STATIONARY ENVIRONMENTS

Cited by: 0
Authors
Luketina, Jelena [1, 2]
Flennerhag, Sebastian [2 ]
Schroecker, Yannick [2 ]
Abel, David [2 ]
Zahavy, Tom [2 ]
Singh, Satinder [2 ]
Affiliations
[1] Univ Oxford, Oxford, England
[2] DeepMind, London, England
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Meta-gradient methods (Xu et al., 2018; Zahavy et al., 2020) offer a promising solution to the problem of hyperparameter selection and adaptation in non-stationary reinforcement learning problems. However, the properties of meta-gradients in such environments have not been systematically studied. In this work, we bring new clarity to meta-gradients in non-stationary environments. Concretely, we ask: (i) how much information should be given to the learned optimizers, so as to enable faster adaptation and generalization over a lifetime, (ii) what meta-optimizer functions are learned in this process, and (iii) whether meta-gradient methods provide a bigger advantage in highly non-stationary environments. To study the effect of information provided to the meta-optimizer, as in recent works (Flennerhag et al., 2022; Almeida et al., 2021), we replace the tuned meta-parameters of fixed update rules with learned meta-parameter functions of selected context features. The context features carry information about agent performance and changes in the environment and hence can inform learned meta-parameter schedules. We find that adding more contextual information is generally beneficial, leading to faster adaptation of meta-parameter values and increased performance. We support these results with a qualitative analysis of resulting meta-parameter schedules and learned functions of context features. Lastly, we find that without context, meta-gradients do not provide a consistent advantage over the baseline in highly non-stationary environments. Our findings suggest that contextualising meta-gradients can play a pivotal role in extracting high performance from meta-gradients in non-stationary settings.
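As an illustration of the idea of replacing a tuned meta-parameter with a learned function of context features, the following is a minimal sketch and not the authors' method. It assumes a scalar parameter `w`, a quadratic loss against a target that switches sign periodically, and a step size given by a sigmoid of a linear function of two hypothetical context features (a bias term and the current error magnitude). The names `contextual_step_size`, `meta_gradient_step`, `run`, and all constants are assumptions made for this example. The context features are treated as fixed inputs, so no gradient flows through them.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def contextual_step_size(eta, context):
    """Map context features to a step size in (0, 1) with a learned linear function."""
    return sigmoid(sum(e * c for e, c in zip(eta, context)))


def meta_gradient_step(w, eta, target, context, meta_lr=0.1):
    """One inner update of w, then one meta-gradient update of eta (hypothetical setup)."""
    g = w - target                       # gradient of 0.5 * (w - target) ** 2
    alpha = contextual_step_size(eta, context)
    w_new = w - alpha * g                # inner update using the contextual step size

    # Meta-loss: the same quadratic loss, evaluated after the inner update.
    # Chain rule: dL/d_eta = dL/d_w_new * d_w_new/d_alpha * d_alpha/d_eta.
    dloss_dalpha = (w_new - target) * (-g)
    dalpha_deta = [alpha * (1.0 - alpha) * c for c in context]
    eta_new = [e - meta_lr * dloss_dalpha * d for e, d in zip(eta, dalpha_deta)]
    return w_new, eta_new, alpha


def run(num_steps=200, switch_every=50):
    """Track a target that flips between +1 and -1 (an assumed toy non-stationarity)."""
    w, eta = 0.0, [0.0, 0.0]
    alphas = []
    for t in range(num_steps):
        target = 1.0 if (t // switch_every) % 2 == 0 else -1.0
        context = [1.0, abs(w - target)]  # bias term + current error magnitude
        w, eta, alpha = meta_gradient_step(w, eta, target, context)
        alphas.append(alpha)
    return w, eta, alphas


if __name__ == "__main__":
    w, eta, alphas = run()
    print("final w:", w)
    print("learned eta:", eta)
    print("first and last step size:", alphas[0], alphas[-1])
```

In this toy setting the error-magnitude feature lets the step size respond when the target switches, which is one simple way to read the abstract's claim that context can inform meta-parameter schedules. The sketch says nothing about the paper's actual architectures, features, or results.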
Pages: 16