DynaSTI: Dynamics modeling with sequential temporal information for reinforcement learning in Atari

Cited by: 1
Authors
Kim, Jaehoon [1 ]
Lee, Young Jae [1 ]
Kwak, Mingu [2 ]
Park, Young Joon [3 ]
Kim, Seoung Bum [1 ]
Affiliations
[1] Korea Univ, Sch Ind Management Engn, 145 Anam Ro, Seoul 02841, South Korea
[2] Georgia Inst Technol, Sch Ind & Syst Engn, Atlanta, GA USA
[3] LG AI Res, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Atari; Dynamics modeling; Hierarchical structure; Self-supervised learning; Reinforcement learning;
DOI
10.1016/j.knosys.2024.112103
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Deep reinforcement learning (DRL) has shown remarkable capabilities in solving sequential decision-making problems. However, DRL requires extensive interactions with image-based environments. Existing methods have combined self-supervised learning or data augmentation to improve sample efficiency. Although understanding the temporal dynamics of the environment is important for effective learning, many methods do not consider this factor. To address the sample efficiency problem, we propose dynamics modeling with sequential temporal information (DynaSTI), which incorporates environmental dynamics and leverages the correlation among trajectories to improve sample efficiency. DynaSTI uses an effective state representation learning strategy as an auxiliary task, using gated recurrent units to capture temporal information. It also integrates forward and inverse dynamics modeling in a hierarchical configuration, enhancing the learning of environmental dynamics compared with using each model separately. The hierarchical structure stabilizes inverse dynamics modeling during training: the inverse model receives inputs derived from forward dynamics modeling, which focuses on extracting features of controllable states and thereby filters out noisy information, rather than taking inputs directly from the encoder. We demonstrate the effectiveness of DynaSTI through experiments on the Atari game benchmark, limiting environment interactions to 100k steps. Our extensive experiments confirm that DynaSTI significantly improves the sample efficiency of DRL, outperforming comparison methods on statistically reliable metrics and approaching human-level performance.
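The hierarchical coupling described in the abstract (GRU temporal features feeding a forward dynamics model, whose outputs in turn feed an inverse dynamics model) can be illustrated with a minimal sketch. The code below is an assumption about how such an auxiliary objective could be wired up in PyTorch; the module sizes, the use of pre-extracted frame features, the one-hot action encoding, and the unweighted loss sum are illustrative placeholders, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalDynamicsSketch(nn.Module):
    """Illustrative auxiliary objective: GRU temporal features feed a forward
    dynamics model, whose outputs in turn feed an inverse dynamics model."""

    def __init__(self, obs_dim=512, latent_dim=256, action_dim=18):
        super().__init__()
        self.action_dim = action_dim
        # Per-frame features -> latent state (a real encoder would be a CNN).
        self.encoder = nn.Linear(obs_dim, latent_dim)
        # GRU aggregates latent states along a trajectory (sequential temporal information).
        self.gru = nn.GRU(latent_dim, latent_dim, batch_first=True)
        # Forward dynamics: (temporal feature, action) -> next latent state.
        self.forward_model = nn.Sequential(
            nn.Linear(latent_dim + action_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )
        # Inverse dynamics: consecutive forward-model outputs -> action logits.
        self.inverse_model = nn.Sequential(
            nn.Linear(2 * latent_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, action_dim),
        )

    def auxiliary_loss(self, obs_feats, actions):
        # obs_feats: (B, T, obs_dim) pre-extracted frame features (assumed given)
        # actions:   (B, T-1) discrete actions between consecutive frames
        z = self.encoder(obs_feats)              # (B, T, latent_dim)
        h, _ = self.gru(z)                       # (B, T, latent_dim) temporal features
        a = F.one_hot(actions, self.action_dim).float()
        # Forward dynamics predicts each next latent state.
        pred_next = self.forward_model(torch.cat([h[:, :-1], a], dim=-1))  # (B, T-1, latent_dim)
        forward_loss = F.mse_loss(pred_next, z[:, 1:].detach())
        # Inverse dynamics recovers the action from pairs of forward-model outputs,
        # i.e. from the "denoised" predictions rather than raw encoder states.
        pair = torch.cat([pred_next[:, :-1], pred_next[:, 1:]], dim=-1)    # (B, T-2, 2*latent_dim)
        logits = self.inverse_model(pair)
        inverse_loss = F.cross_entropy(logits.reshape(-1, self.action_dim),
                                       actions[:, 1:].reshape(-1))
        return forward_loss + inverse_loss

# Usage with random data (shapes only; hyperparameters are placeholders).
model = HierarchicalDynamicsSketch()
feats = torch.randn(8, 10, 512)
acts = torch.randint(0, 18, (8, 9))
loss = model.auxiliary_loss(feats, acts)
loss.backward()

In this sketch the inverse model never sees raw encoder outputs, mirroring the abstract's claim that conditioning inverse dynamics on forward-model predictions filters out uncontrollable, noisy information and stabilizes training.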
Pages: 12