Momentum in Reinforcement Learning

Authors
Vieillard, Nino [1, 2]
Scherrer, Bruno [2 ]
Pietquin, Olivier [1 ]
Geist, Matthieu [1 ]
Affiliations
[1] Google Res, Brain Team, Mountain View, CA 94043 USA
[2] Univ Lorraine, CNRS, INRIA, IECL, F-54000 Nancy, France
Keywords
ENVIRONMENT;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We adapt the concept of momentum from optimization to reinforcement learning. Viewing state-action value functions as analogous to gradients in optimization, we interpret momentum as an average of consecutive q-functions. We derive Momentum Value Iteration (MoVI), a variation of Value Iteration that incorporates this momentum idea. Our analysis shows that this allows MoVI to average errors over successive iterations. We show that the proposed approach can be readily extended to deep learning. Specifically, we propose a simple improvement on DQN based on MoVI and evaluate it on Atari games.
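The idea of treating momentum as an average of consecutive q-functions can be illustrated with a small tabular sketch. This is not the authors' exact algorithm, only one plausible reading of the abstract: each sweep applies a one-step backup under the policy that is greedy with respect to the running average h, and h is then updated as the mean of all q-functions so far. The MDP encoding (P, R), the averaging weight beta = (k - 1) / k, and every function name are assumptions made for illustration.

```python
# Illustrative sketch only: value iteration with a running average of
# q-functions. The MDP encoding and all names here are assumptions.

def greedy(table):
    """For each state, return the index of the action with the largest value."""
    return [max(range(len(row)), key=row.__getitem__) for row in table]


def momentum_value_iteration(P, R, gamma, n_iters):
    """Run n_iters sweeps and return (h, greedy policy with respect to h).

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the immediate reward for taking action a in state s.
    """
    n_states, n_actions = len(R), len(R[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    h = [[0.0] * n_actions for _ in range(n_states)]
    for k in range(1, n_iters + 1):
        # The policy is greedy with respect to the averaged table h, not q.
        pi = greedy(h)
        # One-step backup of the previous q under that policy.
        q = [[R[s][a] + gamma * sum(p * q[s2][pi[s2]] for p, s2 in P[s][a])
              for a in range(n_actions)]
             for s in range(n_states)]
        # Assumed weight: with beta = (k - 1) / k, h is the mean of q_1..q_k.
        beta = (k - 1) / k
        h = [[beta * h[s][a] + (1 - beta) * q[s][a]
              for a in range(n_actions)]
             for s in range(n_states)]
    return h, greedy(h)


# Hypothetical two-state MDP: only staying in state 1 (action 0) pays reward 1.
P = [[[(1.0, 0)], [(1.0, 1)]],
     [[(1.0, 1)], [(1.0, 0)]]]
R = [[0.0, 0.0],
     [1.0, 0.0]]

h, policy = momentum_value_iteration(P, R, gamma=0.9, n_iters=50)
print(policy)
```

Because h averages the successive q-functions, a perturbation in any single sweep enters h with weight 1/k rather than in full, which is the intuition behind the abstract's claim that errors are averaged over iterations.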
Pages: 9