ODE-based Recurrent Model-free Reinforcement Learning for POMDPs

Cited: 0
Authors
Zhao, Xuanle [1 ,2 ]
Zhang, Duzhen [1 ,2 ]
Han, Liyuan [1 ,2 ]
Zhang, Tielin [1 ,2 ]
Xu, Bo [1 ,2 ,3 ]
Affiliations
[1] Institute of Automation, Chinese Academy of Sciences, Beijing, China
[2] School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
[3] Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
Keywords
None listed
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Neural ordinary differential equations (ODEs) are widely recognized as a standard tool for modeling physical mechanisms, helping to perform approximate inference in unknown physical or biological environments. In partially observable (PO) environments, inferring unseen information from raw observations is a central challenge for agents. By using a recurrent policy with a compact context, context-based reinforcement learning provides a flexible way to extract unobservable information from historical transitions. To help the agent extract more dynamics-related information, we present a novel ODE-based recurrent model combined with a model-free reinforcement learning (RL) framework to solve partially observable Markov decision processes (POMDPs). We experimentally demonstrate the efficacy of our method across various PO continuous-control and meta-RL tasks. Furthermore, our experiments illustrate that our method is robust to irregular observations, owing to the ability of ODEs to model irregularly sampled time series.
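The abstract describes the method only at a high level. For orientation, below is a minimal sketch of an ODE-RNN-style context encoder of the general kind the abstract suggests, written in PyTorch. Everything here is an assumption for illustration: the class name ODERNNEncoder, the fixed-step Euler integrator, the GRU-cell update, and all hyperparameters are hypothetical and not taken from the paper.

    # A minimal sketch of an ODE-RNN-style context encoder, assuming PyTorch.
    # All names (ODERNNEncoder, n_euler_steps, etc.) are illustrative
    # assumptions, not details taken from the paper.
    import torch
    import torch.nn as nn

    class ODERNNEncoder(nn.Module):
        """Encodes a history of (observation, action, reward) transitions into
        a compact context vector. Between observations the hidden state evolves
        continuously under a learned ODE dh/dt = f(h); at each observation the
        state is updated discretely with a GRU cell. The continuous-time gap
        handling is what lets the encoder absorb irregularly sampled input."""

        def __init__(self, obs_dim, act_dim, hidden_dim=64, n_euler_steps=5):
            super().__init__()
            self.n_euler_steps = n_euler_steps
            # Learned dynamics f(h) governing the hidden state between observations.
            self.ode_func = nn.Sequential(
                nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
                nn.Linear(hidden_dim, hidden_dim),
            )
            # Discrete update applied when a new transition arrives.
            self.gru_cell = nn.GRUCell(obs_dim + act_dim + 1, hidden_dim)

        def evolve(self, h, dt):
            # Fixed-step Euler integration of dh/dt = f(h) across the gap dt.
            step = dt / self.n_euler_steps
            for _ in range(self.n_euler_steps):
                h = h + step * self.ode_func(h)
            return h

        def forward(self, obs, act, rew, timestamps):
            # obs: (B, T, obs_dim), act: (B, T, act_dim), rew: (B, T, 1),
            # timestamps: (B, T), possibly irregularly spaced observation times.
            B, T, _ = obs.shape
            h = torch.zeros(B, self.gru_cell.hidden_size, device=obs.device)
            t_prev = timestamps[:, 0]
            for t in range(T):
                dt = (timestamps[:, t] - t_prev).clamp(min=0.0).unsqueeze(-1)
                h = self.evolve(h, dt)   # continuous evolution over the time gap
                x = torch.cat([obs[:, t], act[:, t], rew[:, t]], dim=-1)
                h = self.gru_cell(x, h)  # discrete update at the observation
                t_prev = timestamps[:, t]
            return h  # context vector for a model-free actor-critic

In a context-based model-free RL loop, the returned context vector would typically be concatenated with the current observation and fed to standard actor and critic networks (e.g. SAC); because the hidden state evolves in continuous time between observations, the same encoder handles irregularly sampled histories without modification, which matches the robustness claim in the abstract.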
Pages: 17
Related Papers
50 records in total (first 10 shown below)
  • [1] Model-Free Recurrent Reinforcement Learning for AUV Horizontal Control
    Huo, Yujia
    Li, Yiping
    Feng, Xisheng
    3RD INTERNATIONAL CONFERENCE ON AUTOMATION, CONTROL AND ROBOTICS ENGINEERING (CACRE 2018), 2018, 428
  • [2] Recurrent Model-Free RL Can Be a Strong Baseline for Many POMDPs
    Ni, Tianwei
    Eysenbach, Benjamin
    Salakhutdinov, Ruslan
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [3] A scalable model-free recurrent neural network framework for solving POMDPs
    Liu, Zhenzhen
    Elhanany, Itamar
2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2007, : 119+
  • [4] Model-Free Preference-Based Reinforcement Learning
    Wirth, Christian
    Fuernkranz, Johannes
    Neumann, Gerhard
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2222 - 2228
  • [5] Model-based and Model-free Reinforcement Learning for Visual Servoing
    Farahmand, Amir Massoud
    Shademan, Azad
    Jagersand, Martin
    Szepesvari, Csaba
    ICRA: 2009 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-7, 2009, : 4135 - 4142
  • [6] Model-Free Control for Soft Manipulators based on Reinforcement Learning
    You, Xuanke
    Zhang, Yixiao
    Chen, Xiaotong
    Liu, Xinghua
    Wang, Zhanchi
    Jiang, Hao
    Chen, Xiaoping
    2017 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2017, : 2909 - 2915
  • [7] Model-Free Emergency Frequency Control Based on Reinforcement Learning
    Chen, Chunyu
    Cui, Mingjian
    Li, Fangxing
    Yin, Shengfei
    Wang, Xinan
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2021, 17 (04) : 2336 - 2346
  • [8] Model-free Control for Stratospheric Airship Based on Reinforcement Learning
    Nie, Chunyu
    Zhu, Ming
    Zheng, Zewei
    Wu, Zhe
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 10702 - 10707
  • [9] Learning Representations in Model-Free Hierarchical Reinforcement Learning
    Rafati, Jacob
    Noelle, David C.
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 10009 - 10010
  • [10] Model-Free Trajectory Optimization for Reinforcement Learning
    Akrour, Riad
    Abdolmaleki, Abbas
    Abdulsamad, Hany
    Neumann, Gerhard
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016