Momentum in Reinforcement Learning

Citations: 0
Authors
Vieillard, Nino [1, 2]
Scherrer, Bruno [2]
Pietquin, Olivier [1]
Geist, Matthieu [1]
Affiliations
[1] Google Res, Brain Team, Mountain View, CA 94043 USA
[2] Univ Lorraine, CNRS, INRIA, IECL, F-54000 Nancy, France
Keywords
ENVIRONMENT
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We adapt the concept of momentum from optimization to reinforcement learning. Viewing state-action value functions as analogous to gradients in optimization, we interpret momentum as an average of consecutive q-functions. We derive Momentum Value Iteration (MoVI), a variation of Value Iteration that incorporates this momentum idea. Our analysis shows that this allows MoVI to average errors over successive iterations. We show that the proposed approach can be readily extended to deep learning. Specifically, we propose a simple improvement on DQN based on MoVI and evaluate it on Atari games.
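As a rough illustration of the averaging idea in the abstract, the following Python sketch runs a tabular value-iteration loop in which the greedy policy is computed from a running average `h` of the successive q-functions rather than from the latest `q`. The uniform averaging weights, the single Bellman backup per iteration, the function name `momentum_value_iteration`, and the toy two-state MDP are all assumptions made for this example. The abstract does not specify them, so this should not be read as the authors' exact algorithm.

```python
import numpy as np


def momentum_value_iteration(P, R, gamma=0.9, n_iters=200):
    """Illustrative sketch only (assumed form, not taken from the paper).

    P: array of shape (S, A, S) with transition probabilities.
    R: array of shape (S, A) with expected rewards.
    """
    n_states, n_actions = R.shape
    q = np.zeros((n_states, n_actions))
    h = np.zeros((n_states, n_actions))  # running average of the q-functions
    for k in range(n_iters):
        # Act greedily with respect to the averaged q-function h.
        policy = h.argmax(axis=1)
        # One Bellman backup of q under that policy.
        next_values = q[np.arange(n_states), policy]  # shape (S,)
        q = R + gamma * (P @ next_values)             # shape (S, A)
        # Assumed uniform average: h is the mean of all q-functions so far.
        h = h + (q - h) / (k + 1)
    return q, h, h.argmax(axis=1)


# Hypothetical toy MDP: two states that never change,
# action 1 always pays reward 1 and action 0 pays nothing.
P = np.zeros((2, 2, 2))
P[0, :, 0] = 1.0
P[1, :, 1] = 1.0
R = np.array([[0.0, 1.0],
              [0.0, 1.0]])

q, h, policy = momentum_value_iteration(P, R)
print("greedy policy from h:", policy)
print("latest q:", q)
print("averaged h:", h)
```

In this toy problem the average `h` lags behind the latest `q`, which is the intended effect: a single noisy q-estimate would shift `h` only slightly, so the greedy policy depends on the whole history of estimates rather than on the last one.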
Pages: 9