Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes

Times cited: 0
Authors:
Ghasemi, Mahsa [1]
Topcu, Ufuk [1]
Affiliations:
[1] Univ Texas Austin, Austin, TX 78712 USA
Keywords:
APPROXIMATIONS;
DOI:
Not available
CLC classification (Chinese Library Classification):
TP18 [Artificial Intelligence Theory];
Discipline codes:
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In conventional partially observable Markov decision processes (POMDPs), the observations that the agent receives originate from fixed, known distributions. However, in a variety of real-world scenarios the agent plays an active role in its perception by selecting which observations to receive. To avoid the combinatorial expansion of the action space that results from integrating planning and perception decisions, we use a greedy strategy for observation selection that minimizes an information-theoretic measure of state uncertainty. We develop a novel point-based value iteration algorithm that incorporates this greedy strategy to pick perception actions for each sampled belief point in each iteration. As a result, not only does the solver require fewer belief points to approximate the reachable subspace of the belief simplex, but it also requires less computation per iteration. Further, we prove that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy, and we demonstrate its performance empirically.
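The greedy observation-selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; all names (`greedy_select`, `channels`, etc.) are hypothetical, and it assumes candidate sensors are conditionally independent observation channels with known likelihood matrices. Each step adds the channel that most reduces the expected entropy of the Bayes-updated belief.

```python
import numpy as np

def belief_entropy(b):
    """Shannon entropy (bits) of a belief vector over states."""
    p = b[b > 0]
    return float(-np.sum(p * np.log2(p)))

def expected_posterior_entropy(belief, likelihood):
    """Expected entropy of the Bayes-updated belief.

    likelihood[o, s] = p(observation o | state s).
    """
    h = 0.0
    for row in likelihood:
        p_o = float(row @ belief)           # marginal probability of this observation
        if p_o > 0:
            posterior = row * belief / p_o  # Bayes' rule
            h += p_o * belief_entropy(posterior)
    return h

def combine(joint, lik):
    """Joint likelihood of the chosen observations plus a new channel,
    assuming observations are conditionally independent given the state."""
    return (joint[:, None, :] * lik[None, :, :]).reshape(-1, joint.shape[1])

def greedy_select(belief, channels, k):
    """Greedily pick k channels minimizing expected posterior belief entropy."""
    joint = np.ones((1, len(belief)))  # trivial likelihood: nothing observed yet
    chosen, remaining = [], set(range(len(channels)))
    for _ in range(k):
        best_i, best_h, best_joint = None, np.inf, None
        for i in remaining:
            cand = combine(joint, channels[i])
            h = expected_posterior_entropy(belief, cand)
            if h < best_h:
                best_i, best_h, best_joint = i, h, cand
        chosen.append(best_i)
        joint = best_joint
        remaining.remove(best_i)
    return chosen

if __name__ == "__main__":
    belief = np.array([0.5, 0.5])
    noisy = np.array([[0.5, 0.5], [0.5, 0.5]])       # uninformative sensor
    sharp = np.array([[0.9, 0.1], [0.1, 0.9]])       # informative sensor
    print(greedy_select(belief, [noisy, sharp], 1))  # picks the informative one
```

In the paper's setting, a selection of this kind would be computed per sampled belief point inside each value-iteration backup, which is what avoids enumerating all subsets of observations as explicit actions.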
Pages: 2371 - 2377
Page count: 7
Related papers
50 records total
  • [1] Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes
    Bouton, Maxime
    Tumova, Jana
    Kochenderfer, Mykel J.
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 10061 - 10068
  • [2] A method for speeding up value iteration in partially observable Markov decision processes
    Zhang, NL
    Lee, SS
    Zhang, WH
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 1999, : 696 - 703
  • [3] Speeding up the convergence of value iteration in partially observable Markov decision processes
    Zhang, NL
    Zhang, WH
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2001, 14 : 29 - 51
  • [4] Value-function approximations for partially observable Markov decision processes
    Hauskrecht, M
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2000, 13 : 33 - 94
  • [5] Partially Observable Markov Decision Processes and Robotics
    Kurniawati, Hanna
    ANNUAL REVIEW OF CONTROL ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 5 : 253 - 277
  • [6] A tutorial on partially observable Markov decision processes
    Littman, Michael L.
    JOURNAL OF MATHEMATICAL PSYCHOLOGY, 2009, 53 (03) : 119 - 125
  • [7] Quantum partially observable Markov decision processes
    Barry, Jennifer
    Barry, Daniel T.
    Aaronson, Scott
    PHYSICAL REVIEW A, 2014, 90 (03)
  • [8] Online Active Perception for Partially Observable Markov Decision Processes with Limited Budget
    Ghasemi, Mahsa
    Topcu, Ufuk
    2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019, : 6169 - 6174
  • [9] Approximate Value Iteration for Risk-Aware Markov Decision Processes
    Yu, Pengqian
    Haskell, William B.
    Xu, Huan
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2018, 63 (09) : 3135 - 3142