Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes

Cited by: 0
Authors
Ghasemi, Mahsa [1 ]
Topcu, Ufuk [1 ]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
Keywords
APPROXIMATIONS;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In conventional partially observable Markov decision processes (POMDPs), the observations that the agent receives originate from fixed, known distributions. However, in a variety of real-world scenarios, the agent plays an active role in its perception by selecting which observations to receive. To avoid the combinatorial expansion of the action space that results from jointly planning over control and perception decisions, we use a greedy strategy for observation selection that minimizes an information-theoretic measure of state uncertainty. We develop a novel point-based value iteration algorithm that incorporates this greedy strategy to pick perception actions for each sampled belief point in each iteration. As a result, the solver not only requires fewer belief points to approximate the reachable subspace of the belief simplex, but also requires less computation per iteration. Furthermore, we prove that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy, and we demonstrate its performance empirically.
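To make the perception step concrete, the following is a minimal Python sketch of greedy observation selection that minimizes expected belief entropy, one common information-theoretic measure of state uncertainty. All names (`obs_models`, `budget`) and the shortcut of conditioning the working belief on each chosen source's most likely observation are illustrative assumptions, not the authors' implementation; the paper's solver embeds such a selection step inside point-based value iteration.

```python
import numpy as np

def belief_entropy(belief):
    """Shannon entropy (in nats) of a belief vector over states."""
    p = belief[belief > 0]
    return -np.sum(p * np.log(p))

def belief_update(belief, obs_model, obs):
    """Bayes update of the belief given one observation from a single source.

    obs_model[s, o] = P(o | s) for that source.
    """
    posterior = belief * obs_model[:, obs]
    norm = posterior.sum()
    return posterior / norm if norm > 0 else belief

def expected_posterior_entropy(belief, obs_model):
    """Expected entropy of the updated belief under a source's observation model."""
    expected = 0.0
    for obs in range(obs_model.shape[1]):
        p_obs = float(belief @ obs_model[:, obs])  # P(o) = sum_s b(s) P(o|s)
        if p_obs > 0:
            expected += p_obs * belief_entropy(belief_update(belief, obs_model, obs))
    return expected

def greedy_observation_selection(belief, obs_models, budget):
    """Greedily pick up to `budget` observation sources that most reduce
    the expected entropy of the belief (a proxy for state uncertainty)."""
    selected, current = [], belief.copy()
    remaining = set(range(len(obs_models)))
    for _ in range(budget):
        best = min(remaining,
                   key=lambda i: expected_posterior_entropy(current, obs_models[i]))
        selected.append(best)
        remaining.remove(best)
        # Fold in the chosen source by conditioning on its most likely
        # observation; this is a simplification made for this sketch only.
        likely_obs = int(np.argmax(current @ obs_models[best]))
        current = belief_update(current, obs_models[best], likely_obs)
    return selected

# Example: 3 states, two candidate sensors, pick one.
b0 = np.array([0.5, 0.3, 0.2])
sensors = [np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]),
           np.array([[0.6, 0.4], [0.5, 0.5], [0.4, 0.6]])]
print(greedy_observation_selection(b0, sensors, budget=1))
```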
Pages: 2371-2377
Number of pages: 7
Related Papers
50 records in total
  • [41] STRUCTURAL RESULTS FOR PARTIALLY OBSERVABLE MARKOV DECISION-PROCESSES
    ALBRIGHT, SC
    OPERATIONS RESEARCH, 1979, 27 (05) : 1041 - 1053
  • [42] MEDICAL TREATMENTS USING PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES
    Goulionis, John E.
    JP JOURNAL OF BIOSTATISTICS, 2009, 3 (02) : 77 - 97
  • [43] Qualitative Analysis of Partially-Observable Markov Decision Processes
    Chatterjee, Krishnendu
    Doyen, Laurent
    Henzinger, Thomas A.
    MATHEMATICAL FOUNDATIONS OF COMPUTER SCIENCE 2010, 2010, 6281 : 258 - 269
  • [44] Equivalence Relations in Fully and Partially Observable Markov Decision Processes
    Castro, Pablo Samuel
    Panangaden, Prakash
    Precup, Doina
    21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, 2009, : 1653 - 1658
  • [45] Recursively-Constrained Partially Observable Markov Decision Processes
    Ho, Qi Heng
    Becker, Tyler
    Kraske, Benjamin
    Laouar, Zakariya
    Feather, Martin S.
    Rossi, Federico
    Lahijanian, Morteza
    Sunberg, Zachary
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2024, 244 : 1658 - 1680
  • [46] A Fast Approximation Method for Partially Observable Markov Decision Processes
    LIU Bingbing
    KANG Yu
    JIANG Xiaofeng
    QIN Jiahu
    Journal of Systems Science & Complexity, 2018, 31 (06) : 1423 - 1436
  • [47] Active Chemical Sensing With Partially Observable Markov Decision Processes
    Gosangi, Rakesh
    Gutierrez-Osuna, Ricardo
    OLFACTION AND ELECTRONIC NOSE, PROCEEDINGS, 2009, 1137 : 562 - 565
  • [48] Stochastic optimization of controlled partially observable Markov decision processes
    Bartlett, PL
    Baxter, J
    PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 2000, : 124 - 129
  • [49] Reinforcement learning algorithm for partially observable Markov decision processes
    Wang, Xue-Ning
    He, Han-Gen
    Xu, Xin
    Kongzhi yu Juece/Control and Decision, 2004, 19 (11) : 1263 - 1266
  • [50] Partially Observable Markov Decision Processes and Performance Sensitivity Analysis
    Li, Yanjie
    Yin, Baoqun
    Xi, Hongsheng
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (06) : 1645 - 1651