Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning

Cited by: 0
Authors
Huang, Wenzhen [1 ,2 ]
Yin, Qiyue [1 ,2 ]
Zhang, Junge [1 ,2 ]
Huang, Kaiqi [1 ,2 ,3 ]
Institutions
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, CRISE, Beijing, Peoples R China
[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Model-based reinforcement learning (RL) is more sample-efficient than model-free RL because it uses imaginary trajectories generated by a learned dynamics model. When the model is inaccurate or biased, however, imaginary trajectories can be deleterious for training the action-value and policy functions. To alleviate this problem, this paper proposes to adaptively reweight the imaginary transitions so as to reduce the negative effects of poorly generated trajectories. More specifically, we evaluate the effect of an imaginary transition by the change in the loss computed on real samples when the transition is used to train the action-value and policy functions. Based on this criterion, we reweight each imaginary transition with a well-designed meta-gradient algorithm. Extensive experimental results demonstrate that our method outperforms state-of-the-art model-based and model-free RL algorithms on multiple tasks. Visualization of the changing weights further validates the necessity of the reweighting scheme.
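A minimal sketch (in PyTorch, not the authors' code) of the reweighting idea described in the abstract: each imaginary transition gets a learnable weight, a weighted critic update is taken on the imaginary batch, and the weights are adapted by the meta-gradient of the TD loss measured on real transitions after that update. The critic `q_net` (assumed to map state-action batches to a 1-D tensor of Q-values), the batch format, and the hyperparameters are illustrative assumptions; the paper also reweights the policy update, which this sketch omits.

```python
import torch


def meta_reweight_step(q_net, imag_batch, real_batch,
                       gamma=0.99, inner_lr=1e-3, meta_lr=1e-2):
    """One meta-gradient adaptation of per-transition weights for an imaginary batch (illustrative)."""
    s, a, r, s2, a2 = imag_batch         # imaginary transitions from the learned model
    rs, ra, rr, rs2, ra2 = real_batch    # real transitions from the environment buffer

    # Learnable (pre-sigmoid) weight for each imaginary transition.
    w_logits = torch.zeros(r.shape[0], requires_grad=True)
    w = torch.sigmoid(w_logits)

    # Inner step: weighted TD loss on imaginary data, kept differentiable w.r.t. the weights.
    td_target = r + gamma * q_net(s2, a2).detach()
    imag_loss = (w * (q_net(s, a) - td_target) ** 2).mean()
    grads = torch.autograd.grad(imag_loss, tuple(q_net.parameters()), create_graph=True)

    # Functional SGD step so the updated critic parameters keep the graph back to the weights.
    updated = {name: p - inner_lr * g
               for (name, p), g in zip(q_net.named_parameters(), grads)}

    def q_updated(s_, a_):
        return torch.func.functional_call(q_net, updated, (s_, a_))

    # Outer (meta) loss: TD error of the updated critic on *real* transitions.
    real_target = rr + gamma * q_updated(rs2, ra2).detach()
    meta_loss = ((q_updated(rs, ra) - real_target) ** 2).mean()

    # Meta-gradient: how much each transition's weight helped or hurt the real-data loss.
    (w_grad,) = torch.autograd.grad(meta_loss, w_logits)
    with torch.no_grad():
        w_logits -= meta_lr * w_grad
    return torch.sigmoid(w_logits)       # adapted weights for this imaginary batch
```

The key design point the sketch tries to mirror is that the weight update is driven entirely by real-sample loss: an imaginary transition is down-weighted when training on it increases the TD error on real data, which is how poorly generated trajectories are suppressed.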
Pages: 7848-7856
Page count: 9
Related Papers
50 entries in total
  • [31] Online Constrained Model-based Reinforcement Learning
    van Niekerk, Benjamin
    Damianou, Andreas
    Rosman, Benjamin
    CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI2017), 2017,
  • [32] Calibrated Model-Based Deep Reinforcement Learning
    Malik, Ali
    Kuleshov, Volodymyr
    Song, Jiaming
    Nemer, Danny
    Seymour, Harlan
    Ermon, Stefano
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [33] Incremental model-based reinforcement learning with model constraint
    Yang, Zhiyou
    Fu, Mingsheng
    Qu, Hong
    Li, Fan
    Shi, Shuqing
    Hu, Wang
    NEURAL NETWORKS, 2025, 185
  • [34] Learning to Attack Federated Learning: A Model-based Reinforcement Learning Attack Framework
    Li, Henger
    Sun, Xiaolin
    Zheng, Zizhan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [35] Skill-based Model-based Reinforcement Learning
    Shi, Lucy Xiaoyang
    Lim, Joseph J.
    Lee, Youngwoon
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 2262 - 2272
  • [36] Model-Based Reinforcement Learning for Quantized Federated Learning Performance Optimization
    Yang, Nuocheng
    Wang, Sihua
    Chen, Mingzhe
    Brinton, Christopher G.
    Yin, Changchuan
    Saad, Walid
    Cui, Shuguang
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022, : 5063 - 5068
  • [37] Model-based reinforcement learning by pyramidal neurons: Robustness of the learning rule
    Eisele, M
    Sejnowski, T
    PROCEEDINGS OF THE 4TH JOINT SYMPOSIUM ON NEURAL COMPUTATION, VOL 7, 1997, : 83 - 90
  • [38] Weighted model estimation for offline model-based reinforcement learning
    Hishinuma, Toru
    Senda, Kei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [39] Model-based reinforcement learning: a computational model and an fMRI study
    Yoshida, W
    Ishii, S
    NEUROCOMPUTING, 2005, 63 : 253 - 269
  • [40] Latent Causal Dynamics Model for Model-Based Reinforcement Learning
    Hao, Zhifeng
    Zhu, Haipeng
    Chen, Wei
    Cai, Ruichu
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT II, 2024, 14448 : 219 - 230