Masked and Inverse Dynamics Modeling for Data-Efficient Reinforcement Learning

Times Cited: 0
Authors
Lee, Young Jae [1 ]
Kim, Jaehoon [1 ]
Park, Young Joon [2 ]
Kwak, Mingu [3 ]
Kim, Seoung Bum [1 ]
Affiliations
[1] Korea University, Department of Industrial and Management Engineering, Seoul 02841, South Korea
[2] LG AI Research, Seoul 07796, South Korea
[3] Georgia Institute of Technology, School of Industrial and Systems Engineering, Atlanta, GA 30332 USA
Funding
National Research Foundation of Singapore;
Keywords
Data models; Data augmentation; Transformers; Task analysis; Representation learning; Predictive models; Inverse problems; Data-efficient reinforcement learning; inverse dynamics modeling; masked modeling; self-supervised multitask learning; transformer;
DOI
10.1109/TNNLS.2024.3439261
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In pixel-based deep reinforcement learning (DRL), learning representations of states that change because of an agent's action or interaction with the environment poses a critical challenge in improving data efficiency. Recent data-efficient DRL studies have integrated DRL with self-supervised learning (SSL) and data augmentation to learn state representations from given interactions. However, some methods have difficulty explicitly capturing evolving state representations or selecting data augmentations that provide appropriate reward signals. Our goal is to explicitly learn the inherent dynamics that change with an agent's intervention and interaction with the environment. We propose masked and inverse dynamics modeling (MIND), which uses masking augmentation and fewer hyperparameters to learn agent-controllable representations in changing states. Our method consists of self-supervised multitask learning that leverages a transformer architecture to capture the spatiotemporal information underlying highly correlated consecutive frames. MIND performs self-supervised multitask learning with two tasks: masked modeling and inverse dynamics modeling. Masked modeling learns the static visual representations required for control within a state, and inverse dynamics modeling learns the rapidly evolving state representations driven by agent intervention. By integrating inverse dynamics modeling as a complementary component to masked modeling, our method effectively learns evolving state representations. We evaluate our method on discrete and continuous control environments with limited interactions. MIND outperforms previous methods across benchmarks and significantly improves data efficiency. The code is available at https://github.com/dudwojae/MIND.
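
To make the two-task objective concrete, below is a minimal sketch of how masked modeling and inverse dynamics modeling can be combined into a single self-supervised multitask loss over a short window of consecutive frames. It assumes discrete actions and PyTorch; all module names, dimensions, and the frame-level masking scheme here are illustrative assumptions, not the authors' implementation (see https://github.com/dudwojae/MIND for the official code).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MINDSketch(nn.Module):
        """Illustrative multitask loss: masked modeling + inverse dynamics."""

        def __init__(self, n_actions: int, dim: int = 128):
            super().__init__()
            # Shared convolutional encoder: one embedding per frame.
            self.encoder = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
                nn.Flatten(),
                nn.LazyLinear(dim),
            )
            # Transformer over the sequence of frame embeddings captures
            # spatiotemporal structure across highly correlated frames.
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                               batch_first=True)
            self.temporal = nn.TransformerEncoder(layer, num_layers=2)
            self.mask_token = nn.Parameter(torch.zeros(dim))
            # Task 1 head: reconstruct the embeddings of masked frames.
            self.masked_head = nn.Linear(dim, dim)
            # Task 2 head: predict a_t from the pair (s_t, s_{t+1}).
            self.inverse_head = nn.Linear(2 * dim, n_actions)

        def forward(self, frames, actions, mask_ratio: float = 0.5):
            # frames: (B, T, 1, H, W) grayscale frames; actions: (B, T-1).
            B, T = frames.shape[:2]
            z = self.encoder(frames.flatten(0, 1)).view(B, T, -1)

            # Masked modeling: hide random frames, then predict their
            # (stop-gradient) embeddings from the visible context.
            mask = torch.rand(B, T, device=z.device) < mask_ratio
            z_in = torch.where(mask.unsqueeze(-1), self.mask_token, z)
            h = self.temporal(z_in)
            masked_loss = F.mse_loss(self.masked_head(h)[mask],
                                     z.detach()[mask])

            # Inverse dynamics: classify the action taken between
            # consecutive (contextualized) state embeddings.
            pairs = torch.cat([h[:, :-1], h[:, 1:]], dim=-1)
            logits = self.inverse_head(pairs)
            inverse_loss = F.cross_entropy(logits.flatten(0, 1),
                                           actions.flatten())

            # Joint self-supervised multitask objective.
            return masked_loss + inverse_loss

    # Example shapes (hypothetical): windows of 4 stacked 84x84 frames,
    # 6 discrete actions, batch of 8.
    model = MINDSketch(n_actions=6)
    frames = torch.rand(8, 4, 1, 84, 84)
    actions = torch.randint(0, 6, (8, 3))
    loss = model(frames, actions)

In practice such a loss would be optimized jointly with the RL objective as an auxiliary term; the classification head for inverse dynamics only fits discrete action spaces, and a regression head would be the natural substitute for continuous control.
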
Pages: 14