Momentum in Reinforcement Learning

Cited by: 0
Authors
Vieillard, Nino [1, 2]
Scherrer, Bruno [2 ]
Pietquin, Olivier [1 ]
Geist, Matthieu [1 ]
Affiliations
[1] Google Res, Brain Team, Mountain View, CA 94043 USA
[2] Univ Lorraine, CNRS, INRIA, IECL, F-54000 Nancy, France
Keywords
ENVIRONMENT;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We adapt the concept of momentum from optimization to reinforcement learning. Viewing state-action value functions as analogous to gradients in optimization, we interpret momentum as an average of consecutive q-functions. We derive Momentum Value Iteration (MoVI), a variation of Value Iteration that incorporates this momentum idea. Our analysis shows that this allows MoVI to average errors over successive iterations. We show that the proposed approach can be readily extended to deep learning. Specifically, we propose a simple improvement on DQN based on MoVI and evaluate it on Atari games.
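The idea of averaging consecutive q-functions can be illustrated with a minimal tabular sketch. This is not taken from the paper: the abstract does not state the update rule, so the scheme below (act greedily with respect to a running average h of the q-functions, update q with one Bellman backup under that policy, and use the averaging weight 1/k) is an assumption chosen for illustration. The function name, the toy two-state MDP, and all parameters are hypothetical.

```python
def momentum_value_iteration(P, R, gamma=0.9, iters=200):
    """Illustrative sketch only; the update rule is an assumption.

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the reward for taking action a in state s.
    """
    n_s, n_a = len(R), len(R[0])
    q = [[0.0] * n_a for _ in range(n_s)]  # current q-function
    h = [[0.0] * n_a for _ in range(n_s)]  # running average of q-functions

    def greedy(values):
        # Greedy action per state (ties go to the lowest action index).
        return [max(range(n_a), key=lambda a: values[s][a]) for s in range(n_s)]

    for k in range(1, iters + 1):
        pi = greedy(h)  # act greedily w.r.t. the averaged q-functions
        # One Bellman backup of q under the policy pi.
        q = [[R[s][a] + gamma * sum(p * q[s2][pi[s2]] for p, s2 in P[s][a])
              for a in range(n_a)] for s in range(n_s)]
        beta = 1.0 / k  # assumed weight: h becomes the mean of q_1..q_k
        h = [[(1.0 - beta) * h[s][a] + beta * q[s][a]
              for a in range(n_a)] for s in range(n_s)]
    return q, h, greedy(h)


# Hypothetical toy MDP: staying in state 1 (action 0) pays reward 1.
P = [[[(1.0, 0)], [(1.0, 1)]],   # state 0: action 0 stays, action 1 moves to 1
     [[(1.0, 1)], [(1.0, 0)]]]   # state 1: action 0 stays, action 1 moves to 0
R = [[0.0, 0.0],
     [1.0, 0.0]]

q, h, policy = momentum_value_iteration(P, R)
print(policy)
```

In this sketch the policy is driven by the average h rather than by the latest q, so a perturbation in any single iterate of q is diluted across iterations. That is the intuition behind averaging errors, under the assumptions stated above.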
Pages: 9