Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes

Cited by: 0
Authors
Ghasemi, Mahsa [1 ]
Topcu, Ufuk [1 ]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
Keywords
APPROXIMATIONS;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In conventional partially observable Markov decision processes (POMDPs), the observations that the agent receives originate from fixed, known distributions. However, in a variety of real-world scenarios, the agent plays an active role in its perception by selecting which observations to receive. To avoid the combinatorial expansion of the action space that results from jointly planning over control and perception decisions, we use a greedy strategy for observation selection that minimizes an information-theoretic measure of state uncertainty. We develop a novel point-based value iteration algorithm that incorporates this greedy strategy to pick perception actions for each sampled belief point in each iteration. As a result, the solver not only requires fewer belief points to approximate the reachable subspace of the belief simplex, but also requires less computation per iteration. Furthermore, we prove that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy, and we demonstrate its performance empirically.
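To make the perception step concrete, the following is a minimal Python sketch of greedy observation selection that minimizes expected belief entropy, one common information-theoretic measure of state uncertainty. All names (`obs_models`, `budget`) and the shortcut of conditioning the working belief on each chosen source's most likely observation are illustrative assumptions, not the authors' implementation; the paper's solver embeds such a selection step inside point-based value iteration.

```python
import numpy as np

def belief_entropy(belief):
    """Shannon entropy (in nats) of a belief vector over states."""
    p = belief[belief > 0]
    return -np.sum(p * np.log(p))

def belief_update(belief, obs_model, obs):
    """Bayes update of the belief given one observation from a single source.

    obs_model[s, o] = P(o | s) for that source.
    """
    posterior = belief * obs_model[:, obs]
    norm = posterior.sum()
    return posterior / norm if norm > 0 else belief

def expected_posterior_entropy(belief, obs_model):
    """Expected entropy of the updated belief under a source's observation model."""
    expected = 0.0
    for obs in range(obs_model.shape[1]):
        p_obs = float(belief @ obs_model[:, obs])  # P(o) = sum_s b(s) P(o|s)
        if p_obs > 0:
            expected += p_obs * belief_entropy(belief_update(belief, obs_model, obs))
    return expected

def greedy_observation_selection(belief, obs_models, budget):
    """Greedily pick up to `budget` observation sources that most reduce
    the expected entropy of the belief (a proxy for state uncertainty)."""
    selected, current = [], belief.copy()
    remaining = set(range(len(obs_models)))
    for _ in range(budget):
        best = min(remaining,
                   key=lambda i: expected_posterior_entropy(current, obs_models[i]))
        selected.append(best)
        remaining.remove(best)
        # Fold in the chosen source by conditioning on its most likely
        # observation; this is a simplification made for this sketch only.
        likely_obs = int(np.argmax(current @ obs_models[best]))
        current = belief_update(current, obs_models[best], likely_obs)
    return selected

# Example: 3 states, two candidate sensors, pick one.
b0 = np.array([0.5, 0.3, 0.2])
sensors = [np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]),
           np.array([[0.6, 0.4], [0.5, 0.5], [0.4, 0.6]])]
print(greedy_observation_selection(b0, sensors, budget=1))
```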
Pages: 2371-2377
Number of pages: 7
Related Papers
50 records in total
  • [41] STRUCTURAL RESULTS FOR PARTIALLY OBSERVABLE MARKOV DECISION-PROCESSES
    ALBRIGHT, SC
    OPERATIONS RESEARCH, 1979, 27 (05) : 1041 - 1053
  • [42] MEDICAL TREATMENTS USING PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES
    Goulionis, John E.
    JP JOURNAL OF BIOSTATISTICS, 2009, 3 (02) : 77 - 97
  • [43] Qualitative Analysis of Partially-Observable Markov Decision Processes
    Chatterjee, Krishnendu
    Doyen, Laurent
    Henzinger, Thomas A.
    MATHEMATICAL FOUNDATIONS OF COMPUTER SCIENCE 2010, 2010, 6281 : 258 - 269
  • [44] Equivalence Relations in Fully and Partially Observable Markov Decision Processes
    Castro, Pablo Samuel
    Panangaden, Prakash
    Precup, Doina
    21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, 2009, : 1653 - 1658
  • [45] Recursively-Constrained Partially Observable Markov Decision Processes
    Ho, Qi Heng
    Becker, Tyler
    Kraske, Benjamin
    Laouar, Zakariya
    Feather, Martin S.
    Rossi, Federico
    Lahijanian, Morteza
    Sunberg, Zachary
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2024, 244 : 1658 - 1680
  • [46] A Fast Approximation Method for Partially Observable Markov Decision Processes
    LIU Bingbing
    KANG Yu
    JIANG Xiaofeng
    QIN Jiahu
    Journal of Systems Science & Complexity, 2018, 31 (06) : 1423 - 1436
  • [47] Active Chemical Sensing With Partially Observable Markov Decision Processes
    Gosangi, Rakesh
    Gutierrez-Osuna, Ricardo
    OLFACTION AND ELECTRONIC NOSE, PROCEEDINGS, 2009, 1137 : 562 - 565
  • [48] Stochastic optimization of controlled partially observable Markov decision processes
    Bartlett, PL
    Baxter, J
    PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 2000, : 124 - 129
  • [49] Reinforcement learning algorithm for partially observable Markov decision processes
    Wang, Xue-Ning
    He, Han-Gen
    Xu, Xin
    Kongzhi yu Juece/Control and Decision, 2004, 19 (11) : 1263 - 1266
  • [50] Partially Observable Markov Decision Processes and Performance Sensitivity Analysis
    Li, Yanjie
    Yin, Baoqun
    Xi, Hongsheng
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (06) : 1645 - 1651