Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning

Cited by: 0
Authors
Huang, Wenzhen [1, 2]
Yin, Qiyue [1, 2]
Zhang, Junge [1, 2]
Huang, Kaiqi [1, 2, 3]
Affiliations
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, CRISE, Beijing, Peoples R China
[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Model-based reinforcement learning (RL) is more sample efficient than model-free RL because it uses imaginary trajectories generated by a learned dynamics model. When the model is inaccurate or biased, however, imaginary trajectories may be deleterious for training the action-value and policy functions. To alleviate this problem, this paper proposes to adaptively reweight the imaginary transitions so as to reduce the negative effects of poorly generated trajectories. More specifically, we evaluate the effect of an imaginary transition by calculating the change in the loss computed on real samples when the transition is used to train the action-value and policy functions. Based on this evaluation criterion, we reweight each imaginary transition with a meta-gradient algorithm. Extensive experimental results demonstrate that our method outperforms state-of-the-art model-based and model-free RL algorithms on multiple tasks. Visualization of the changing weights further validates the necessity of the reweighting scheme.
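The evaluation criterion in the abstract (the change in real-sample loss caused by training on one imaginary transition) can be illustrated with a toy sketch. This is not the paper's algorithm, whose details this record does not give. It assumes a scalar linear predictor `theta * x` in place of the action-value function, a squared-error loss, and per-sample weights updated by a single meta-gradient step and clipped to [0, 1]. All function names, data, and step sizes below are hypothetical.

```python
# Toy illustration only: a scalar linear predictor q(x) = theta * x stands in
# for the action-value function. Names and numbers are assumptions, not the
# paper's implementation.

def grad_sq_loss(theta, x, y):
    """Gradient of (theta * x - y)^2 with respect to theta."""
    return 2.0 * (theta * x - y) * x

def mean_loss(theta, data):
    """Mean squared error of the predictor on a list of (x, y) pairs."""
    return sum((theta * x - y) ** 2 for x, y in data) / len(data)

def mean_grad(theta, data):
    """Gradient of mean_loss with respect to theta."""
    return sum(grad_sq_loss(theta, x, y) for x, y in data) / len(data)

def transition_effects(theta, imagined, real, lr=0.05):
    """Change in real-sample loss after one SGD step on each imagined sample alone."""
    base = mean_loss(theta, real)
    effects = []
    for x, y in imagined:
        theta_new = theta - lr * grad_sq_loss(theta, x, y)
        effects.append(mean_loss(theta_new, real) - base)
    return effects

def meta_gradient_weights(theta, imagined, real, weights, lr=0.05, meta_lr=1.0):
    """One meta-gradient step on the per-sample weights, clipped to [0, 1]."""
    grads = [grad_sq_loss(theta, x, y) for x, y in imagined]
    # Inner update: one weighted gradient step on the imagined samples.
    theta_new = theta - lr * sum(w * g for w, g in zip(weights, grads))
    real_grad = mean_grad(theta_new, real)
    new_weights = []
    for w, g in zip(weights, grads):
        # Chain rule: d L_real(theta_new) / d w_i = -lr * g_i * real_grad
        meta_grad = -lr * g * real_grad
        new_weights.append(min(1.0, max(0.0, w - meta_lr * meta_grad)))
    return new_weights

real = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # consistent with theta = 2
imagined = [(1.5, 3.0), (2.5, -5.0)]          # second sample is badly wrong
theta = 0.0

effects = transition_effects(theta, imagined, real)
weights = meta_gradient_weights(theta, imagined, real, [0.5, 0.5])
print(effects)  # negative entry: helpful sample; positive entry: harmful sample
print(weights)  # the harmful sample ends up with the smaller weight
```

In this toy case the first imagined sample lowers the real-sample loss and the second raises it, so the meta-gradient step moves the two weights in opposite directions.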
Pages: 7848 - 7856
Number of pages: 9
Related papers
50 in total
  • [21] Adaptive Discretization for Model-Based Reinforcement Learning
    Sinclair, Sean R.
    Wang, Tianyu
    Jain, Gauri
    Banerjee, Siddhartha
    Yu, Christina Lee
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS (NEURIPS 2020), 2020, 33
  • [22] Model-based average reward reinforcement learning
    Tadepalli, P
    Ok, D
    ARTIFICIAL INTELLIGENCE, 1998, 100 (1-2) : 177 - 224
  • [23] Continual Model-Based Reinforcement Learning with Hypernetworks
    Huang, Yizhou
    Xie, Kevin
    Bharadhwaj, Homanga
    Shkurti, Florian
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 799 - 805
  • [24] Model-based Reinforcement Learning and the Eluder Dimension
    Osband, Ian
    Van Roy, Benjamin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [25] Model-Based Reinforcement Learning in Robotics: A Survey
    Sun S.
    Lan X.
    Zhang H.
    Zheng N.
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2022, 35 (01) : 1 - 16
  • [26] MOReL: Model-Based Offline Reinforcement Learning
    Kidambi, Rahul
    Rajeswaran, Aravind
    Netrapalli, Praneeth
    Joachims, Thorsten
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [27] Model-Based Reinforcement Learning With Isolated Imaginations
    Pan, Minting
    Zhu, Xiangming
    Zheng, Yitao
    Wang, Yunbo
    Yang, Xiaokang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 2788 - 2803
  • [28] Consistency of Fuzzy Model-Based Reinforcement Learning
    Busoniu, Lucian
    Ernst, Damien
    De Schutter, Bart
    Babuska, Robert
    2008 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-5, 2008, : 518 - +
  • [29] Asynchronous Methods for Model-Based Reinforcement Learning
    Zhang, Yunzhi
    Clavera, Ignasi
    Tsai, Boren
    Abbeel, Pieter
    CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
  • [30] Abstraction Selection in Model-Based Reinforcement Learning
    Jiang, Nan
    Kulesza, Alex
    Singh, Satinder
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 179 - 188