Complexity Bounds for Deterministic Partially Observed Markov Decision Processes

Cited: 0
Authors
Vessaire, Cyrille [1 ]
Carpentier, Pierre [2 ]
Chancelier, Jean-Philippe [1 ]
De Lara, Michel [1 ]
Rodriguez-Martinez, Alejandro [3 ]
Affiliations
[1] Ecole Ponts ParisTech, CERMICS, 6 & 8 Ave Blaise Pascal, F-77455 Marne La Vallee, France
[2] IP Paris, UMA, ENSTA Paris, 828 Blvd Marechaux, F-91762 Palaiseau, France
[3] TotalEnergies SE, Pau, France
DOI
10.1007/s10479-024-06282-0
Chinese Library Classification
C93 [Management Science]; O22 [Operations Research];
Subject Classification Codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
Partially Observed Markov Decision Processes (Pomdp) share the structure of Markov Decision Processes (Mdp) - with stages, states, actions, probability transitions, rewards - but differ in the notion of solutions. In a Pomdp, observation mappings provide partial and/or imperfect knowledge of the state, and a policy maps observations (and not states, as in an Mdp) to actions. Theoretically, a Pomdp can be solved by Dynamic Programming (DP), but with an information state made of probability distributions over the original state; hence DP suffers from the curse of dimensionality, even in the finite case. This is why authors like Littman (Algorithms for Sequential Decision Making. PhD thesis, Brown University, 1996) and Bonet (Deterministic POMDPs revisited. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, UAI '09, pp. 59-66. Arlington, Virginia, USA. AUAI Press, 2009) have studied the subclass of so-called Deterministic Partially Observed Markov Decision Processes (Det-Pomdp), where transition and observation mappings are deterministic. In this paper, we improve on Littman's complexity bounds. We then introduce and study a more restricted class, Separated Det-Pomdps, and give new complexity bounds for this class.
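The key structural fact behind complexity bounds for Det-Pomdps is that, when transitions and observations are deterministic, the information state collapses from a probability distribution over states to the *set* of states consistent with the observed history, so the belief space is finite (a subset of the power set of the state space) rather than a continuous simplex. A minimal sketch of this set-valued belief update, on a hypothetical toy model (the functions `f`, `h` and the parity example below are illustrative assumptions, not from the paper):

```python
# Sketch of a Det-Pomdp belief update: with a deterministic transition f(x, u)
# and a deterministic observation h(x), the belief after a history is simply
# the set of states compatible with it. Convention assumed here: the
# observation is made on the *next* state, after the action is applied.

def belief_update(belief, action, observation, f, h):
    """Propagate a belief (set of states) through the deterministic
    transition f, then keep only states matching the observation h."""
    return {f(x, action) for x in belief
            if h(f(x, action)) == observation}

# Toy example: 4 states on a ring, the action shifts the state,
# the observation reveals only the parity of the state.
f = lambda x, u: (x + u) % 4
h = lambda x: x % 2

b0 = {0, 1, 2, 3}                    # total ignorance
b1 = belief_update(b0, 1, 0, f, h)   # apply u=1, then observe parity 0
print(b1)                            # → {0, 2}
```

Since beliefs are sets, there are at most 2^|X| of them, which is what makes finite-horizon DP over this information state tractable to bound, as in the complexity analyses the abstract refers to.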
Pages: 345 / 382
Page count: 38