Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes

Cited by: 0
Authors
Ghasemi, Mahsa [1 ]
Topcu, Ufuk [1 ]
Affiliation
[1] Univ Texas Austin, Austin, TX 78712 USA
Source
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2019
Keywords
APPROXIMATIONS;
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In conventional partially observable Markov decision processes (POMDPs), the observations that the agent receives originate from fixed, known distributions. In a variety of real-world scenarios, however, the agent plays an active role in its perception by selecting which observations to receive. We avoid the combinatorial expansion of the action space that results from integrating planning and perception decisions by using a greedy strategy for observation selection that minimizes an information-theoretic measure of state uncertainty. We develop a novel point-based value iteration algorithm that incorporates this greedy strategy to pick perception actions for each sampled belief point in each iteration. As a result, the solver not only requires fewer belief points to approximate the reachable subspace of the belief simplex, but also less computation per iteration. Further, we prove that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy, and we demonstrate its performance empirically.
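The greedy observation-selection idea sketched in the abstract can be illustrated with a minimal example. The sketch below is an assumption-laden simplification, not the paper's algorithm: it represents the belief as a discrete distribution over states, models each candidate observation source by a likelihood matrix, and greedily picks the sources whose single observation minimizes the expected Shannon entropy of the Bayes posterior. All names (`greedy_observation_selection`, `likelihoods`, etc.) are hypothetical, and a faithful version would condition later picks on the outcomes of earlier ones.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (0 log 0 := 0)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_posterior_entropy(belief, likelihood):
    """Expected entropy of the Bayes posterior after one observation.

    likelihood[z, s] = P(observation z | state s).
    """
    h = 0.0
    for z in range(likelihood.shape[0]):
        pz = likelihood[z] @ belief              # marginal P(z)
        if pz > 0:
            posterior = likelihood[z] * belief / pz  # Bayes update
            h += pz * entropy(posterior)
    return h

def greedy_observation_selection(belief, likelihoods, k):
    """Greedily pick k observation sources, each step minimizing the
    expected posterior entropy of the current belief.

    Simplification: each pick is scored against the current belief only,
    rather than conditioning on the joint outcome of earlier picks.
    """
    chosen = []
    remaining = list(range(len(likelihoods)))
    for _ in range(k):
        best = min(remaining,
                   key=lambda i: expected_posterior_entropy(belief,
                                                            likelihoods[i]))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Under a uniform belief, a source whose likelihood matrix is the identity (a perfect state sensor) drives the expected posterior entropy to zero and is selected ahead of an uninformative source whose rows are constant.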
Pages: 2371-2377
Page count: 7
Related papers (50 total)
  • [21] Decentralized Control of Partially Observable Markov Decision Processes
    Amato, Christopher
    Chowdhary, Girish
    Geramifard, Alborz
    Ure, N. Kemal
    Kochenderfer, Mykel J.
    2013 IEEE 52ND ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2013, : 2398 - 2405
  • [22] Partially observable Markov decision processes with reward information
    Cao, XR
    Guo, XP
    2004 43RD IEEE CONFERENCE ON DECISION AND CONTROL (CDC), VOLS 1-5, 2004, : 4393 - 4398
  • [23] On Anderson Acceleration for Partially Observable Markov Decision Processes
    Ermis, Melike
    Park, Mingyu
    Yang, Insoon
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 4478 - 4485
  • [24] Transition Entropy in Partially Observable Markov Decision Processes
    Melo, Francisco S.
    Ribeiro, Isabel
    INTELLIGENT AUTONOMOUS SYSTEMS 9, 2006, : 282 - +
  • [25] Minimal Disclosure in Partially Observable Markov Decision Processes
    Bertrand, Nathalie
    Genest, Blaise
    IARCS ANNUAL CONFERENCE ON FOUNDATIONS OF SOFTWARE TECHNOLOGY AND THEORETICAL COMPUTER SCIENCE (FSTTCS 2011), 2011, 13 : 411 - 422
  • [26] Partially Observable Markov Decision Processes in Robotics: A Survey
    Lauri, Mikko
    Hsu, David
    Pajarinen, Joni
    IEEE TRANSACTIONS ON ROBOTICS, 2023, 39 (01) : 21 - 40
  • [27] A primer on partially observable Markov decision processes (POMDPs)
    Chades, Iadine
    Pascal, Luz V.
    Nicol, Sam
    Fletcher, Cameron S.
    Ferrer-Mestres, Jonathan
    METHODS IN ECOLOGY AND EVOLUTION, 2021, 12 (11): : 2058 - 2072
  • [28] Partially observable Markov decision processes with imprecise parameters
    Itoh, Hideaki
    Nakamura, Kiyohiko
    ARTIFICIAL INTELLIGENCE, 2007, 171 (8-9) : 453 - 490
  • [29] Nonapproximability results for partially observable Markov decision processes
    Lusena, Cristopher
    Goldsmith, Judy
    Mundhenk, Martin
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2001, 14
  • [30] Value set iteration for Markov decision processes
    Chang, Hyeong Soo
    AUTOMATICA, 2014, 50 (07) : 1940 - 1943