META-GRADIENTS IN NON-STATIONARY ENVIRONMENTS

Cited by: 0
Authors
Luketina, Jelena [1, 2]
Flennerhag, Sebastian [2 ]
Schroecker, Yannick [2 ]
Abel, David [2 ]
Zahavy, Tom [2 ]
Singh, Satinder [2 ]
Affiliations
[1] Univ Oxford, Oxford, England
[2] DeepMind, London, England
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Meta-gradient methods (Xu et al., 2018; Zahavy et al., 2020) offer a promising solution to the problem of hyperparameter selection and adaptation in non-stationary reinforcement learning problems. However, the properties of meta-gradients in such environments have not been systematically studied. In this work, we bring new clarity to meta-gradients in non-stationary environments. Concretely, we ask: (i) how much information should be given to the learned optimizers, so as to enable faster adaptation and generalization over a lifetime, (ii) what meta-optimizer functions are learned in this process, and (iii) whether meta-gradient methods provide a bigger advantage in highly non-stationary environments. To study the effect of information provided to the meta-optimizer, as in recent works (Flennerhag et al., 2022; Almeida et al., 2021), we replace the tuned meta-parameters of fixed update rules with learned meta-parameter functions of selected context features. The context features carry information about agent performance and changes in the environment and hence can inform learned meta-parameter schedules. We find that adding more contextual information is generally beneficial, leading to faster adaptation of meta-parameter values and increased performance. We support these results with a qualitative analysis of resulting meta-parameter schedules and learned functions of context features. Lastly, we find that without context, meta-gradients do not provide a consistent advantage over the baseline in highly non-stationary environments. Our findings suggest that contextualising meta-gradients can play a pivotal role in extracting high performance from meta-gradients in non-stationary settings.
Pages: 16
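The abstract describes replacing a tuned meta-parameter of a fixed update rule with a learned function of context features. The following is a minimal toy sketch of that idea, not the authors' method or experimental setup: a scalar learner tracks a noisy target that flips sign periodically, and its step size is a sigmoid of a learned linear function of the context, trained with a one-step (truncated) meta-gradient. The function `run`, the choice of `abs(g)` as the context feature, the tracking task, and all constants are assumptions made for illustration only.

```python
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def run(use_context, steps=4000, period=200, noise=0.3, meta_lr=0.05, seed=0):
    """Track a sign-flipping target; return (average next-step loss, learned eta)."""
    rng = random.Random(seed)
    # Noisy observations of a target that flips between +1 and -1 every `period` steps.
    ys = [(1.0 if (t // period) % 2 == 0 else -1.0) + rng.gauss(0.0, noise)
          for t in range(steps)]
    w = 0.0                                    # learner parameter
    eta = [0.0, 0.0] if use_context else [0.0]  # meta-parameters (hypothetical form)
    total = 0.0
    for t in range(steps - 1):
        g = w - ys[t]                          # gradient of 0.5 * (w - y_t)^2
        # Context: a bias term, plus the current error magnitude if enabled.
        ctx = [1.0, abs(g)] if use_context else [1.0]
        alpha = sigmoid(sum(e * c for e, c in zip(eta, ctx)))  # step size in (0, 1)
        w_new = w - alpha * g                  # inner update
        err = w_new - ys[t + 1]                # error on the next observation
        total += 0.5 * err * err
        # One-step meta-gradient of 0.5 * err^2 with respect to eta:
        # err * d(w_new)/d(alpha) * d(alpha)/d(eta) = err * (-g) * alpha * (1 - alpha) * ctx
        scale = err * (-g) * alpha * (1.0 - alpha)
        eta = [e - meta_lr * scale * c for e, c in zip(eta, ctx)]
        w = w_new
    return total / (steps - 1), eta
```

Calling `run(False)` learns a single context-free step size, while `run(True)` lets the step size depend on the current error magnitude. The sketch makes no claim about which variant performs better; it only shows where the context enters the meta-parameter function and how the meta-gradient flows to `eta`.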