Model-Based Reinforcement Learning With Isolated Imaginations

Citations: 0
Authors
Pan, Minting [1]
Zhu, Xiangming [1]
Zheng, Yitao [1]
Wang, Yunbo [1]
Yang, Xiaokang [1]
Affiliations
[1] Shanghai Jiao Tong University, AI Institute, MoE Key Lab of Artificial Intelligence, Shanghai 200240, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Decoupled dynamics; model-based reinforcement learning; world model
DOI
10.1109/TPAMI.2023.3335263
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
World models learn the consequences of actions in vision-based interactive systems. However, in practical scenarios such as autonomous driving, noncontrollable dynamics that are independent of, or only sparsely dependent on, action signals often exist, making it challenging to learn effective world models. To address this issue, we propose Iso-Dream++, a model-based reinforcement learning approach with two main contributions. First, we optimize the inverse dynamics to encourage the world model to isolate controllable state transitions from the mixed spatiotemporal variations of the environment. Second, we perform policy optimization based on the decoupled latent imaginations, rolling out noncontrollable states into the future and adaptively associating them with the current controllable state. This allows long-horizon visuomotor control tasks to benefit from isolating the mixed dynamics sources in the wild; for example, a self-driving agent can anticipate the movement of other vehicles and thereby avoid potential risks. On top of our previous work (Pan et al., 2022), we further consider the sparse dependencies between controllable and noncontrollable states, address the training-collapse problem of state decoupling, and validate our approach in transfer learning setups. Our empirical study demonstrates that Iso-Dream++ significantly outperforms existing reinforcement learning models on CARLA and DeepMind Control.
Pages: 2788-2803
Page count: 16
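The abstract describes two mechanisms: an inverse-dynamics objective that isolates controllable state transitions, and imagination rollouts in which the noncontrollable branch evolves action-free and is then associated with the controllable branch for decision making. The following is a minimal conceptual sketch of that decoupled-rollout control flow only; every name, the linear-tanh dynamics, and the concatenation-based fusion are illustrative assumptions, not the paper's actual implementation, which learns neural transition models and adaptively associates the two branches.

```python
# A minimal conceptual sketch of decoupled latent rollouts, NOT the authors'
# implementation. Dimensions, the linear-tanh dynamics, and the naive fusion
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM, HORIZON = 4, 2, 5

# Stand-ins for learned transition models: the controllable branch is
# conditioned on actions, the noncontrollable branch evolves action-free.
W_ctrl = rng.normal(scale=0.1, size=(STATE_DIM, STATE_DIM))
W_act = rng.normal(scale=0.1, size=(STATE_DIM, ACTION_DIM))
W_nonctrl = rng.normal(scale=0.1, size=(STATE_DIM, STATE_DIM))

def step_controllable(s_ctrl, action):
    """Controllable transition: depends on the agent's action."""
    return np.tanh(W_ctrl @ s_ctrl + W_act @ action)

def step_noncontrollable(s_nonctrl):
    """Noncontrollable transition: independent of the action signal."""
    return np.tanh(W_nonctrl @ s_nonctrl)

def imagine(s_ctrl, s_nonctrl, policy, horizon=HORIZON):
    """Roll out both branches and fuse them for the policy at each step.

    The noncontrollable branch is rolled into the future on its own, so the
    policy can anticipate dynamics it cannot influence (e.g., other vehicles).
    """
    trajectory = []
    for _ in range(horizon):
        fused = np.concatenate([s_ctrl, s_nonctrl])  # naive fusion; the paper
        action = policy(fused)                       # adaptively associates them
        s_ctrl = step_controllable(s_ctrl, action)
        s_nonctrl = step_noncontrollable(s_nonctrl)
        trajectory.append(fused)
    return trajectory

random_policy = lambda s: rng.normal(size=ACTION_DIM)
traj = imagine(rng.normal(size=STATE_DIM), rng.normal(size=STATE_DIM), random_policy)
print(len(traj), traj[0].shape)  # 5 (8,)
```

In the actual method, the two branches would be trained jointly with reconstruction and inverse-dynamics losses, and the association between branches would be learned rather than a fixed concatenation; the sketch only illustrates the control flow of rolling both branches forward during imagination.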