POMDPs in Continuous Time and Discrete Spaces

Cited by: 0

Authors:

Alt, Bastian [1]

Schultheis, Matthias [1,2]

Koeppl, Heinz [1,2]

Affiliations:

[1] Tech Univ Darmstadt, Dept Elect Engn & Informat Technol, Darmstadt, Germany

[2] Tech Univ Darmstadt, Ctr Cognit Sci, Darmstadt, Germany

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020 / Volume 33

Funding:

European Research Council;

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making in such systems, with discrete state and action spaces, under partial observability. This places our work at the intersection of optimal filtering and optimal control. To date, a mathematical description of simultaneous decision making and filtering in continuous time with finite state and action spaces has been missing. In this paper, we give a mathematical description of a continuous-time partially observable Markov decision process (POMDP). By leveraging optimal filtering theory, we derive a Hamilton-Jacobi-Bellman (HJB) type equation that characterizes the optimal solution. Using techniques from deep learning, we approximately solve the resulting partial integro-differential equation. We present (i) an approach that solves the decision problem offline by learning an approximation of the value function and (ii) an online algorithm that provides a solution in belief space using deep reinforcement learning. We demonstrate the applicability on a set of toy examples, which pave the way for future methods providing solutions for high-dimensional problems.


Pages: 12
