Meta-Reinforcement Learning in Nonstationary and Nonparametric Environments

Citations: 8
Authors
Bing, Zhenshan [1 ]
Knak, Lukas [1 ]
Cheng, Long [2 ]
Morin, Fabrice O. [1 ]
Huang, Kai [3 ]
Knoll, Alois [1 ]
Affiliations
[1] Tech Univ Munich, Dept Informat, D-85748 Munich, Germany
[2] Wenzhou Univ, Coll Comp Sci & Artificial Intelligence, Wenzhou 325035, Peoples R China
[3] Sun Yat-sen Univ, Sch Data & Comp Sci, Guangzhou 543000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Training; Adaptation models; Robots; Probabilistic logic; Turning; Switches; Gaussian variational autoencoder (VAE); meta-reinforcement learning (meta-RL); robotic control; task adaptation; task inference;
DOI
10.1109/TNNLS.2023.3270298
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent state-of-the-art artificial agents lack the ability to adapt rapidly to new tasks, as they are trained exclusively for specific objectives and require massive amounts of interaction to learn new skills. Meta-reinforcement learning (meta-RL) addresses this challenge by leveraging knowledge learned from training tasks to perform well in previously unseen tasks. However, current meta-RL approaches limit themselves to narrow parametric and stationary task distributions, ignoring the qualitative differences and nonstationary changes between tasks that occur in the real world. In this article, we introduce a Task-Inference-based meta-RL algorithm using explicitly parameterized Gaussian variational autoencoders (VAEs) and gated recurrent units (TIGR), designed for nonparametric and nonstationary environments. We employ a generative model involving a VAE to capture the multimodality of the tasks. We decouple policy training from task-inference learning and efficiently train the inference mechanism with an unsupervised reconstruction objective. We establish a zero-shot adaptation procedure that enables the agent to adapt to nonstationary task changes. We provide a benchmark with qualitatively distinct tasks based on the half-cheetah environment and demonstrate the superior performance of TIGR over state-of-the-art meta-RL approaches in terms of sample efficiency (three to ten times faster), asymptotic performance, and applicability in nonparametric and nonstationary environments with zero-shot adaptation. Videos can be viewed at https://videoviewsite.wixsite.com/tigr.
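The abstract describes a task-inference architecture: a recurrent encoder maps recent transitions to a latent task variable, and a decoder trained with an unsupervised reconstruction objective supervises the encoder, decoupled from policy learning. The following is a minimal, hypothetical PyTorch sketch of that idea; the module names, dimensions, and the single-Gaussian latent (the paper uses a Gaussian mixture to capture task multimodality) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of GRU-based task inference with a Gaussian VAE latent,
# trained on an unsupervised reconstruction objective, as outlined in the
# abstract. Names and dimensions are assumptions for illustration only.
import torch
import torch.nn as nn

class TaskEncoder(nn.Module):
    """GRU over a window of (s, a, r, s') transitions -> Gaussian latent z."""
    def __init__(self, transition_dim, hidden_dim=64, latent_dim=8):
        super().__init__()
        self.gru = nn.GRU(transition_dim, hidden_dim, batch_first=True)
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.log_std = nn.Linear(hidden_dim, latent_dim)

    def forward(self, transitions):           # (batch, time, transition_dim)
        _, h = self.gru(transitions)          # h: (1, batch, hidden_dim)
        h = h.squeeze(0)
        mu, log_std = self.mu(h), self.log_std(h).clamp(-5, 2)
        std = log_std.exp()
        z = mu + std * torch.randn_like(std)  # reparameterization trick
        return z, mu, std

class TransitionDecoder(nn.Module):
    """Predicts next state and reward from (s, a, z) for reconstruction."""
    def __init__(self, state_dim, action_dim, latent_dim, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_dim + 1),  # next state + reward
        )

    def forward(self, state, action, z):
        return self.net(torch.cat([state, action, z], dim=-1))

def inference_loss(encoder, decoder, batch, beta=0.1):
    """Unsupervised VAE objective: reconstruction + KL to N(0, I).

    batch["s"], batch["a"], batch["s_next"]: (batch, time, dim);
    batch["r"]: (batch, time, 1).
    """
    transitions = torch.cat(
        [batch["s"], batch["a"], batch["r"], batch["s_next"]], dim=-1)
    z, mu, std = encoder(transitions)
    # Reconstruct the last transition of the window from its (s, a) and z.
    pred = decoder(batch["s"][:, -1], batch["a"][:, -1], z)
    target = torch.cat([batch["s_next"][:, -1], batch["r"][:, -1]], dim=-1)
    recon = ((pred - target) ** 2).mean()
    kl = (0.5 * (mu ** 2 + std ** 2 - 2 * std.log() - 1)).sum(-1).mean()
    return recon + beta * kl
```

Consistent with the abstract's description, the policy would simply consume z as an extra input and be trained separately from this objective; zero-shot adaptation then amounts to re-encoding the most recent transitions at every step, so z tracks nonstationary task switches without any gradient updates at test time.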
Pages: 13604-13618
Number of pages: 15