Prediction and Control in Continual Reinforcement Learning

Cited: 0
Authors
Anand, Nishanth [1 ,2 ]
Precup, Doina [1 ,3 ]
Affiliations
[1] McGill Univ, Sch Comp Sci, Montreal, PQ, Canada
[2] Mila, Montreal, PQ, Canada
[3] DeepMind, London, England
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
GAME; GO;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Temporal difference (TD) learning is often used to update the estimate of the value function, which RL agents use to extract useful policies. In this paper, we focus on value function estimation in continual reinforcement learning. We propose to decompose the value function into two components which update at different timescales: a permanent value function, which holds general knowledge that persists over time, and a transient value function, which allows quick adaptation to new situations. We establish theoretical results showing that our approach is well suited for continual learning, and we draw connections to the complementary learning systems (CLS) theory from neuroscience. Empirically, this approach improves performance significantly on both prediction and control problems.
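The decomposition described in the abstract can be illustrated with a minimal tabular sketch. This is an illustrative reconstruction, not the authors' exact algorithm: the class name, learning rates, and the consolidation rule are assumptions, and the paper's precise update equations should be consulted for the real method. The key idea shown here is that the combined estimate V(s) = V_perm(s) + V_trans(s) is trained with a fast TD update absorbed by the transient component, while a slow consolidation step folds transient knowledge into the permanent component.

```python
import numpy as np


class PermanentTransientValue:
    """Hypothetical two-timescale value decomposition (names illustrative).

    V(s) = v_perm[s] + v_trans[s]: a slowly changing permanent component
    plus a fast transient component for quick adaptation.
    """

    def __init__(self, n_states, gamma=0.9, fast_lr=0.5, slow_lr=0.01):
        self.v_perm = np.zeros(n_states)   # general, persistent knowledge
        self.v_trans = np.zeros(n_states)  # quick adaptation to the current task
        self.gamma, self.fast_lr, self.slow_lr = gamma, fast_lr, slow_lr

    def value(self, s):
        # The agent acts on the combined estimate.
        return self.v_perm[s] + self.v_trans[s]

    def td_update(self, s, r, s_next):
        # Fast timescale: TD(0) error on the combined estimate,
        # absorbed entirely by the transient component.
        delta = r + self.gamma * self.value(s_next) - self.value(s)
        self.v_trans[s] += self.fast_lr * delta

    def consolidate(self):
        # Slow timescale: fold transient knowledge into the permanent
        # component, then reset the transient part (loosely analogous to
        # CLS-style consolidation from hippocampus to neocortex).
        self.v_perm += self.slow_lr * self.v_trans
        self.v_trans[:] = 0.0
```

In this sketch, a task change only perturbs the fast transient table, so resetting it between tasks leaves the accumulated permanent estimate intact; this mirrors the stability/plasticity split the abstract attributes to the two components.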
Pages: 39
Related Papers
50 results in total
  • [1] Continual World: A Robotic Benchmark For Continual Reinforcement Learning
    Wolczyk, Maciej
    Zajac, Michal
    Pascanu, Razvan
    Kucinski, Lukasz
    Milos, Piotr
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [2] Continual Reinforcement Learning with Complex Synapses
    Kaplanis, Christos
    Shanahan, Murray
    Clopath, Claudia
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [3] Disentangling Transfer in Continual Reinforcement Learning
    Wolczyk, Maciej
    Zajac, Michal
    Pascanu, Razvan
    Kucinski, Lukasz
    Milos, Piotr
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [4] Adaptive Exploration for Continual Reinforcement Learning
    Stulp, Freek
    2012 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2012, : 1631 - 1636
  • [5] Policy Consolidation for Continual Reinforcement Learning
    Kaplanis, Christos
    Shanahan, Murray
    Clopath, Claudia
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [6] Learning to Navigate for Mobile Robot with Continual Reinforcement Learning
    Wang, Ning
    Zhang, Dingyuan
    Wang, Yong
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 3701 - 3706
  • [7] THE EFFECTIVENESS OF WORLD MODELS FOR CONTINUAL REINFORCEMENT LEARNING
    Kessler, Samuel
    Ostaszewski, Mateusz
    Bortkiewicz, Michal
    Zarski, Mateusz
    Wolczyk, Maciej
    Parker-Holder, Jack
    Roberts, Stephen J.
    Milos, Piotr
    CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 232, 2023, 232 : 184 - 204
  • [8] Towards Continual Reinforcement Learning: A Review and Perspectives
    Khetarpal, Khimya
    Riemer, Matthew
    Rish, Irina
    Precup, Doina
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2022, 75 : 1401 - 1476
  • [9] COOM: A Game Benchmark for Continual Reinforcement Learning
    Tomilin, Tristan
    Fang, Meng
    Zhang, Yudi
    Pechenizkiy, Mykola
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Avalanche RL: A Continual Reinforcement Learning Library
    Lucchesi, Nicolo
    Carta, Antonio
    Lomonaco, Vincenzo
    Bacciu, Davide
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2022, PT I, 2022, 13231 : 524 - 535