ODE-based Recurrent Model-free Reinforcement Learning for POMDPs

Cited: 0
Authors
Zhao, Xuanle [1 ,2 ]
Zhang, Duzhen [1 ,2 ]
Han, Liyuan [1 ,2 ]
Zhang, Tielin [1 ,2 ]
Xu, Bo [1 ,2 ,3 ]
Affiliations
[1] Institute of Automation, Chinese Academy of Sciences, Beijing, China
[2] School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
[3] Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
Keywords
None listed
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Neural ordinary differential equations (ODEs) are widely recognized as a standard tool for modeling physical mechanisms, helping to perform approximate inference in unknown physical or biological environments. In partially observable (PO) environments, inferring unseen information from raw observations is a central challenge for agents. By using a recurrent policy with a compact context, context-based reinforcement learning provides a flexible way to extract unobservable information from historical transitions. To help the agent extract more dynamics-related information, we present a novel ODE-based recurrent model combined with a model-free reinforcement learning (RL) framework to solve partially observable Markov decision processes (POMDPs). We experimentally demonstrate the efficacy of our method across various PO continuous-control and meta-RL tasks. Furthermore, our experiments illustrate that our method is robust to irregular observations, owing to the ability of ODEs to model irregularly sampled time series.
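The abstract describes the method only at a high level. For orientation, below is a minimal sketch of an ODE-RNN-style context encoder of the general kind the abstract suggests, written in PyTorch. Everything here is an assumption for illustration: the class name ODERNNEncoder, the fixed-step Euler integrator, the GRU-cell update, and all hyperparameters are hypothetical and not taken from the paper.

    # A minimal sketch of an ODE-RNN-style context encoder, assuming PyTorch.
    # All names (ODERNNEncoder, n_euler_steps, etc.) are illustrative
    # assumptions, not details taken from the paper.
    import torch
    import torch.nn as nn

    class ODERNNEncoder(nn.Module):
        """Encodes a history of (observation, action, reward) transitions into
        a compact context vector. Between observations the hidden state evolves
        continuously under a learned ODE dh/dt = f(h); at each observation the
        state is updated discretely with a GRU cell. The continuous-time gap
        handling is what lets the encoder absorb irregularly sampled input."""

        def __init__(self, obs_dim, act_dim, hidden_dim=64, n_euler_steps=5):
            super().__init__()
            self.n_euler_steps = n_euler_steps
            # Learned dynamics f(h) governing the hidden state between observations.
            self.ode_func = nn.Sequential(
                nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
                nn.Linear(hidden_dim, hidden_dim),
            )
            # Discrete update applied when a new transition arrives.
            self.gru_cell = nn.GRUCell(obs_dim + act_dim + 1, hidden_dim)

        def evolve(self, h, dt):
            # Fixed-step Euler integration of dh/dt = f(h) across the gap dt.
            step = dt / self.n_euler_steps
            for _ in range(self.n_euler_steps):
                h = h + step * self.ode_func(h)
            return h

        def forward(self, obs, act, rew, timestamps):
            # obs: (B, T, obs_dim), act: (B, T, act_dim), rew: (B, T, 1),
            # timestamps: (B, T), possibly irregularly spaced observation times.
            B, T, _ = obs.shape
            h = torch.zeros(B, self.gru_cell.hidden_size, device=obs.device)
            t_prev = timestamps[:, 0]
            for t in range(T):
                dt = (timestamps[:, t] - t_prev).clamp(min=0.0).unsqueeze(-1)
                h = self.evolve(h, dt)   # continuous evolution over the time gap
                x = torch.cat([obs[:, t], act[:, t], rew[:, t]], dim=-1)
                h = self.gru_cell(x, h)  # discrete update at the observation
                t_prev = timestamps[:, t]
            return h  # context vector for a model-free actor-critic

In a context-based model-free RL loop, the returned context vector would typically be concatenated with the current observation and fed to standard actor and critic networks (e.g. SAC); because the hidden state evolves in continuous time between observations, the same encoder handles irregularly sampled histories without modification, which matches the robustness claim in the abstract.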
Pages: 17
Related Papers
50 records in total (first 10 shown below)
  • [1] Model-Free Recurrent Reinforcement Learning for AUV Horizontal Control
    Huo, Yujia
    Li, Yiping
    Feng, Xisheng
    3RD INTERNATIONAL CONFERENCE ON AUTOMATION, CONTROL AND ROBOTICS ENGINEERING (CACRE 2018), 2018, 428
  • [2] Recurrent Model-Free RL Can Be a Strong Baseline for Many POMDPs
    Ni, Tianwei
    Eysenbach, Benjamin
    Salakhutdinov, Ruslan
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [3] A scalable model-free recurrent neural network framework for solving POMDPs
    Liu, Zhenzhen
    Elhanany, Itamar
2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2007, : 119+
  • [4] Model-Free Preference-Based Reinforcement Learning
    Wirth, Christian
    Fuernkranz, Johannes
    Neumann, Gerhard
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2222 - 2228
  • [5] Model-based and Model-free Reinforcement Learning for Visual Servoing
    Farahmand, Amir Massoud
    Shademan, Azad
    Jagersand, Martin
    Szepesvari, Csaba
    ICRA: 2009 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-7, 2009, : 4135 - 4142
  • [6] Model-Free Control for Soft Manipulators based on Reinforcement Learning
    You, Xuanke
    Zhang, Yixiao
    Chen, Xiaotong
    Liu, Xinghua
    Wang, Zhanchi
    Jiang, Hao
    Chen, Xiaoping
    2017 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2017, : 2909 - 2915
  • [7] Model-Free Emergency Frequency Control Based on Reinforcement Learning
    Chen, Chunyu
    Cui, Mingjian
    Li, Fangxing
    Yin, Shengfei
    Wang, Xinan
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2021, 17 (04) : 2336 - 2346
  • [8] Model-free Control for Stratospheric Airship Based on Reinforcement Learning
    Nie, Chunyu
    Zhu, Ming
    Zheng, Zewei
    Wu, Zhe
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 10702 - 10707
  • [9] Learning Representations in Model-Free Hierarchical Reinforcement Learning
    Rafati, Jacob
    Noelle, David C.
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 10009 - 10010
  • [10] Model-Free Trajectory Optimization for Reinforcement Learning
    Akrour, Riad
    Abdolmaleki, Abbas
    Abdulsamad, Hany
    Neumann, Gerhard
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016