Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning

Cited by: 0
Authors
Huang, Wenzhen [1 ,2 ]
Yin, Qiyue [1 ,2 ]
Zhang, Junge [1 ,2 ]
Huang, Kaiqi [1 ,2 ,3 ]
Institutions
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, CRISE, Beijing, Peoples R China
[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Model-based reinforcement learning (RL) is more sample-efficient than model-free RL because it uses imaginary trajectories generated by a learned dynamics model. When the model is inaccurate or biased, however, imaginary trajectories can be deleterious for training the action-value and policy functions. To alleviate this problem, this paper proposes to adaptively reweight the imaginary transitions so as to reduce the negative effects of poorly generated trajectories. More specifically, we evaluate the effect of an imaginary transition by the change in the loss computed on real samples when the transition is used to train the action-value and policy functions. Based on this criterion, we reweight each imaginary transition with a well-designed meta-gradient algorithm. Extensive experimental results demonstrate that our method outperforms state-of-the-art model-based and model-free RL algorithms on multiple tasks. Visualization of the changing weights further validates the necessity of the reweighting scheme.
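A minimal sketch (in PyTorch, not the authors' code) of the reweighting idea described in the abstract: each imaginary transition gets a learnable weight, a weighted critic update is taken on the imaginary batch, and the weights are adapted by the meta-gradient of the TD loss measured on real transitions after that update. The critic `q_net` (assumed to map state-action batches to a 1-D tensor of Q-values), the batch format, and the hyperparameters are illustrative assumptions; the paper also reweights the policy update, which this sketch omits.

```python
import torch


def meta_reweight_step(q_net, imag_batch, real_batch,
                       gamma=0.99, inner_lr=1e-3, meta_lr=1e-2):
    """One meta-gradient adaptation of per-transition weights for an imaginary batch (illustrative)."""
    s, a, r, s2, a2 = imag_batch         # imaginary transitions from the learned model
    rs, ra, rr, rs2, ra2 = real_batch    # real transitions from the environment buffer

    # Learnable (pre-sigmoid) weight for each imaginary transition.
    w_logits = torch.zeros(r.shape[0], requires_grad=True)
    w = torch.sigmoid(w_logits)

    # Inner step: weighted TD loss on imaginary data, kept differentiable w.r.t. the weights.
    td_target = r + gamma * q_net(s2, a2).detach()
    imag_loss = (w * (q_net(s, a) - td_target) ** 2).mean()
    grads = torch.autograd.grad(imag_loss, tuple(q_net.parameters()), create_graph=True)

    # Functional SGD step so the updated critic parameters keep the graph back to the weights.
    updated = {name: p - inner_lr * g
               for (name, p), g in zip(q_net.named_parameters(), grads)}

    def q_updated(s_, a_):
        return torch.func.functional_call(q_net, updated, (s_, a_))

    # Outer (meta) loss: TD error of the updated critic on *real* transitions.
    real_target = rr + gamma * q_updated(rs2, ra2).detach()
    meta_loss = ((q_updated(rs, ra) - real_target) ** 2).mean()

    # Meta-gradient: how much each transition's weight helped or hurt the real-data loss.
    (w_grad,) = torch.autograd.grad(meta_loss, w_logits)
    with torch.no_grad():
        w_logits -= meta_lr * w_grad
    return torch.sigmoid(w_logits)       # adapted weights for this imaginary batch
```

The key design point the sketch tries to mirror is that the weight update is driven entirely by real-sample loss: an imaginary transition is down-weighted when training on it increases the TD error on real data, which is how poorly generated trajectories are suppressed.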
Pages: 7848-7856
Page count: 9
Related Papers
50 entries in total
  • [31] Online Constrained Model-based Reinforcement Learning
    van Niekerk, Benjamin
    Damianou, Andreas
    Rosman, Benjamin
    CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI2017), 2017,
  • [32] Calibrated Model-Based Deep Reinforcement Learning
    Malik, Ali
    Kuleshov, Volodymyr
    Song, Jiaming
    Nemer, Danny
    Seymour, Harlan
    Ermon, Stefano
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [33] Incremental model-based reinforcement learning with model constraint
    Yang, Zhiyou
    Fu, Mingsheng
    Qu, Hong
    Li, Fan
    Shi, Shuqing
    Hu, Wang
    NEURAL NETWORKS, 2025, 185
  • [34] Learning to Attack Federated Learning: A Model-based Reinforcement Learning Attack Framework
    Li, Henger
    Sun, Xiaolin
    Zheng, Zizhan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [35] Skill-based Model-based Reinforcement Learning
    Shi, Lucy Xiaoyang
    Lim, Joseph J.
    Lee, Youngwoon
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 2262 - 2272
  • [36] Model-Based Reinforcement Learning for Quantized Federated Learning Performance Optimization
    Yang, Nuocheng
    Wang, Sihua
    Chen, Mingzhe
    Brinton, Christopher G.
    Yin, Changchuan
    Saad, Walid
    Cui, Shuguang
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022, : 5063 - 5068
  • [37] Model-based reinforcement learning by pyramidal neurons: Robustness of the learning rule
    Eisele, M
    Sejnowski, T
    PROCEEDINGS OF THE 4TH JOINT SYMPOSIUM ON NEURAL COMPUTATION, VOL 7, 1997, : 83 - 90
  • [38] Weighted model estimation for offline model-based reinforcement learning
    Hishinuma, Toru
    Senda, Kei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [39] Model-based reinforcement learning: a computational model and an fMRI study
    Yoshida, W
    Ishii, S
    NEUROCOMPUTING, 2005, 63 : 253 - 269
  • [40] Latent Causal Dynamics Model for Model-Based Reinforcement Learning
    Hao, Zhifeng
    Zhu, Haipeng
    Chen, Wei
    Cai, Ruichu
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT II, 2024, 14448 : 219 - 230