POMDPs in Continuous Time and Discrete Spaces

Cited by: 0
Authors
Alt, Bastian [1]
Schultheis, Matthias [1, 2]
Koeppl, Heinz [1, 2]
Affiliations
[1] Tech Univ Darmstadt, Dept Elect Engn & Informat Technol, Darmstadt, Germany
[2] Tech Univ Darmstadt, Ctr Cognit Sci, Darmstadt, Germany
Funding
European Research Council
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many processes, such as discrete event systems in engineering or population dynamics in biology, evolve in discrete space and continuous time. We consider the problem of optimal decision making under partial observability in such systems with discrete state and action spaces. This places our work at the intersection of optimal filtering and optimal control. At the current state of research, a mathematical description of simultaneous decision making and filtering in continuous time with finite state and action spaces is still missing. In this paper, we give a mathematical description of a continuous-time partially observable Markov decision process (POMDP). By leveraging optimal filtering theory, we derive a Hamilton-Jacobi-Bellman (HJB) type equation that characterizes the optimal solution. Using techniques from deep learning, we approximately solve the resulting partial integro-differential equation. We present (i) an approach that solves the decision problem offline by learning an approximation of the value function and (ii) an online algorithm that provides a solution in belief space using deep reinforcement learning. We show the applicability on a set of toy examples, which pave the way for future methods providing solutions for high-dimensional problems.
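As a rough illustration of what "filtering in continuous time with finite state spaces" can look like, the sketch below propagates a belief over a small continuous-time Markov chain and corrects it with a Bayes step. It is not the method of the paper: the abstract does not specify the observation model, so noisy observations arriving at discrete time points are an assumption, as are the function names `predict` and `correct`, the rate matrix `Q` (taken for one fixed action), and the forward Euler integrator.

```python
import numpy as np

def predict(belief, Q, t, dt=1e-3):
    """Propagate a belief (row vector) over a finite-state CTMC for time t.

    Integrates the master equation db/dt = b Q with forward Euler.
    Q is a rate matrix (rows sum to zero) for one fixed action (assumption).
    """
    b = np.asarray(belief, dtype=float)
    n_steps = int(round(t / dt))
    for _ in range(n_steps):
        b = b + dt * (b @ Q)       # Euler step of db/dt = b Q
        b = np.clip(b, 0.0, None)  # guard against tiny negative entries
        b = b / b.sum()            # keep the belief normalized
    return b

def correct(belief, likelihood):
    """Bayes update at an observation time (assumed discrete-time observations)."""
    post = np.asarray(belief, dtype=float) * np.asarray(likelihood, dtype=float)
    return post / post.sum()

# Hypothetical two-state example: rates 0 -> 1 of 1.0 and 1 -> 0 of 2.0.
Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
b0 = np.array([1.0, 0.0])                         # start certain of state 0
b_pred = predict(b0, Q, t=0.5)                    # belief just before the observation
b_post = correct(b_pred, likelihood=[0.2, 0.8])   # observation favors state 1
```

A policy operating "in belief space" would take a vector such as `b_post` as its input; for long horizons the predicted belief approaches the stationary distribution of `Q`, which for the rates above is (2/3, 1/3).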
Pages: 12
Related papers
50 in total
  • [1] Solving POMDPs with Continuous or Large Discrete Observation Spaces
    Hoey, Jesse
    Poupart, Pascal
    19TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-05), 2005: 1332-1338
  • [2] Parametric POMDPs for planning in continuous state spaces
    Brooks, Alex
    Makarenko, Alexei
    Williams, Stefan
    Durrant-Whyte, Hugh
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2006, 54 (11): 887-897
  • [3] Approximate Control for Continuous-Time POMDPs
    Eich, Yannick
    Alt, Bastian
    Koeppl, Heinz
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [4] Online Algorithms for POMDPs with Continuous State, Action, and Observation Spaces
    Sunberg, Zachary N.
    Kochenderfer, Mykel J.
    TWENTY-EIGHTH INTERNATIONAL CONFERENCE ON AUTOMATED PLANNING AND SCHEDULING (ICAPS 2018), 2018: 259-263
  • [5] Sparse Tree Search Optimality Guarantees in POMDPs with Continuous Observation Spaces
    Lim, Michael H.
    Tomlin, Claire J.
    Sunberg, Zachary N.
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020: 4135-4142
  • [6] Efficient Sampling in POMDPs with Lipschitz Bandits for Motion Planning in Continuous Spaces
    Tas, Omer Sahin
    Hauser, Felix
    Lauer, Martin
    2021 32ND IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2021: 1081-1088
  • [7] EVOLUTION EQUATIONS IN DISCRETE AND CONTINUOUS TIME FOR NONEXPANSIVE OPERATORS IN BANACH SPACES
    Vigeral, Guillaume
    ESAIM-CONTROL OPTIMISATION AND CALCULUS OF VARIATIONS, 2010, 16 (04): 809-832
  • [8] Observation-Based Optimization for POMDPs With Continuous State, Observation, and Action Spaces
    Jiang, Xiaofeng
    Yang, Jian
    Tan, Xiaobin
    Xi, Hongsheng
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (05): 2045-2052
  • [9] Sampling-based Algorithms for Continuous-time POMDPs
    Chaudhari, Pratik
    Karaman, Sertac
    Hsu, David
    Frazzoli, Emilio
    2013 AMERICAN CONTROL CONFERENCE (ACC), 2013: 4604-4610
  • [10] On discrete and continuous quotient Riesz spaces
    Wnuk, Witold
    POSITIVITY, 2011, 15 (01): 73-85