POMDPs in Continuous Time and Discrete Spaces

Cited by: 0

Authors:

Alt, Bastian [1]

Schultheis, Matthias [1,2]

Koeppl, Heinz [1,2]

Affiliations:

[1] Tech Univ Darmstadt, Dept Elect Engn & Informat Technol, Darmstadt, Germany

[2] Tech Univ Darmstadt, Ctr Cognit Sci, Darmstadt, Germany

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020 / Volume 33

Funding:

European Research Council;

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making under partial observability in such systems with discrete state and action spaces. This places our work at the intersection of optimal filtering and optimal control. To date, a mathematical description of simultaneous decision making and filtering in continuous time with finite state and action spaces is still missing. In this paper, we give a mathematical description of a continuous-time partially observable Markov decision process (POMDP). By leveraging optimal filtering theory, we derive a Hamilton-Jacobi-Bellman (HJB) type equation that characterizes the optimal solution. Using techniques from deep learning, we approximately solve the resulting partial integro-differential equation. We present (i) an approach that solves the decision problem offline by learning an approximation of the value function and (ii) an online algorithm that provides a solution in belief space using deep reinforcement learning. We demonstrate the applicability on a set of toy examples, which pave the way for future methods providing solutions for high-dimensional problems.


Pages: 12

