Meta-Reinforcement Learning by Tracking Task Non-stationarity

Cited by: 0
Authors
Poiani, Riccardo [1 ]
Tirinzoni, Andrea [2 ]
Restelli, Marcello [1 ]
Affiliations
[1] Politecnico di Milano, Milan, Italy
[2] Inria Lille, Lille, France
Keywords
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Many real-world domains are subject to structured non-stationarity that affects both the agent's goals and the environmental dynamics. Meta-reinforcement learning (RL) has proven successful for training agents that quickly adapt to related tasks. However, most existing meta-RL algorithms for non-stationary domains either make strong assumptions on the task-generation process or require sampling from it at training time. In this paper, we propose a novel algorithm (TRIO) that optimizes for the future by explicitly tracking the task evolution through time. At training time, TRIO learns a variational module to quickly identify latent parameters from experience samples. This module is learned jointly with an optimal exploration policy that takes task uncertainty into account. At test time, TRIO tracks the evolution of the latent parameters online, hence reducing the uncertainty over future tasks and obtaining fast adaptation through the meta-learned policy. Unlike most existing methods, TRIO does not assume Markovian task-evolution processes, does not require information about the non-stationarity at training time, and captures complex changes occurring in the environment. We evaluate our algorithm on different simulated problems and show that it outperforms competitive baselines.
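
To make the test-time tracking idea concrete, the following is a minimal, purely illustrative Python/NumPy sketch, not the authors' implementation. It assumes that a pre-trained variational encoder has already produced one latent estimate per past task, and that the drift of the latent parameters can be extrapolated one step ahead with a simple low-degree polynomial fit; the names predict_next_latent and policy are hypothetical and do not come from the paper's code.

    import numpy as np

    def predict_next_latent(latent_history, degree=2):
        """Extrapolate the latent task parameters one step ahead by fitting a
        low-degree polynomial to the sequence of past per-task latent estimates.

        latent_history: array of shape (T, d), one inferred latent vector per past task.
        Returns an array of shape (d,) predicting the latent of task T+1.
        """
        T, d = latent_history.shape
        timesteps = np.arange(T)
        prediction = np.empty(d)
        for j in range(d):
            # Fit each latent dimension independently, then evaluate at the next step.
            coeffs = np.polyfit(timesteps, latent_history[:, j], deg=min(degree, T - 1))
            prediction[j] = np.polyval(coeffs, T)
        return prediction

    # Toy usage with 1-D latents drifting across four past tasks.
    latent_history = np.array([[0.00], [0.10], [0.25], [0.45]])
    z_next = predict_next_latent(latent_history)
    print(z_next)  # extrapolated latent for the upcoming task
    # A meta-learned policy (not shown) would then condition on z_next, e.g.:
    # action = policy(observation, z_next)   # `policy` is hypothetical here

In this sketch, the predicted latent is available before any data from the new task is collected, which is what allows the meta-learned policy to adapt quickly, as described in the abstract above.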
Pages: 2899-2905
Number of pages: 7
Related Papers (50 in total)
  • [21] Solow, A. R., Huppert, A. On non-stationarity of ENSO. Geophysical Research Letters, 2003, 30(17).
  • [22] Singer, B. Aspects of non-stationarity. Journal of Econometrics, 1982, 18(1): 169-190.
  • [23] Wang, J. X., Kurth-Nelson, Z., Kumaran, D., Tirumala, D., Soyer, H., Leibo, J. Z., Hassabis, D., Botvinick, M. Prefrontal cortex as a meta-reinforcement learning system. Nature Neuroscience, 2018, 21: 860-868.
  • [24] Zhao, T. Z., Luo, J., Sushkov, O., Pevceviciute, R., Heess, N., Scholz, J., Schaal, S., Levine, S. Offline meta-reinforcement learning for industrial insertion. 2022 IEEE International Conference on Robotics and Automation (ICRA 2022), 2022: 6386-6393.
  • [25] McClement, D. G., Lawrence, N. P., Loewen, P. D., Forbes, M. G., Backstrom, J. U., Gopaluni, R. B. A meta-reinforcement learning approach to process control. IFAC-PapersOnLine, 2021, 54(3): 685-692.
  • [26] Hussein, M., Keshk, M., Hussein, A. Non-stationarity detection in model-free reinforcement learning via value function monitoring. Advances in Artificial Intelligence (AI 2023), Part II, 2024, 14472: 350-362.
  • [27] Mao, W., Qiu, H., Wang, C., Franke, H., Kalbarczyk, Z., Iyer, R. K., Basar, T. Multi-agent meta-reinforcement learning: sharper convergence rates with task similarity. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
  • [28] Jabri, A., Hsu, K., Eysenbach, B., Gupta, A., Levine, S., Finn, C. Unsupervised curricula for visual meta-reinforcement learning. Advances in Neural Information Processing Systems 32 (NIPS 2019), 2019.
  • [29] Gupta, A., Mendonca, R., Liu, Y., Abbeel, P., Levine, S. Meta-reinforcement learning of structured exploration strategies. Advances in Neural Information Processing Systems 31 (NIPS 2018), 2018.
  • [30] Imagawa, T., Hiraoka, T., Tsuruoka, Y. Off-policy meta-reinforcement learning with belief-based task inference. IEEE Access, 2022, 10: 49494-49507.