Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes

Cited: 0
Authors
Ghasemi, Mahsa [1]
Topcu, Ufuk [1]
Affiliation
[1] Univ Texas Austin, Austin, TX 78712 USA
Keywords
APPROXIMATIONS;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In conventional partially observable Markov decision processes, the observations that the agent receives originate from fixed, known distributions. In a variety of real-world scenarios, however, the agent plays an active role in its perception by selecting which observations to receive. Integrating planning and perception decisions directly would cause a combinatorial expansion of the action space; we avoid this through a greedy strategy for observation selection that minimizes an information-theoretic measure of state uncertainty. We develop a novel point-based value iteration algorithm that incorporates this greedy strategy to pick perception actions for each sampled belief point in each iteration. As a result, the solver not only requires fewer belief points to approximate the reachable subspace of the belief simplex, but also less computation per iteration. Further, we prove that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy, and we demonstrate its performance empirically.
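The greedy observation-selection idea described in the abstract can be illustrated with a small sketch. The code below is an assumption-laden illustration, not the paper's implementation: it uses expected posterior entropy of the belief as the information-theoretic uncertainty measure, assumes each candidate observation source has a known likelihood matrix p(z | s), and treats sources as conditionally independent given the state. All function names (`greedy_select`, `combine`, etc.) are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (nats)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def combine(L1, L2):
    """Joint likelihood of two sources assumed conditionally
    independent given the state: p(z1, z2 | s) = p(z1|s) p(z2|s)."""
    return np.einsum('as,bs->abs', L1, L2).reshape(-1, L1.shape[1])

def expected_posterior_entropy(belief, L):
    """E_z[H(b')] where b'(s) is the Bayes update of `belief` on z."""
    pz = L @ belief                       # marginal p(z)
    h = 0.0
    for z in range(L.shape[0]):
        if pz[z] > 1e-12:
            h += pz[z] * entropy(L[z] * belief / pz[z])
    return h

def greedy_select(belief, likelihoods, k):
    """Greedily pick k observation sources, at each step adding the
    one that most reduces the expected entropy of the belief."""
    chosen, joint = [], None
    remaining = list(range(len(likelihoods)))
    for _ in range(k):
        scores = {}
        for i in remaining:
            L = likelihoods[i] if joint is None else combine(joint, likelihoods[i])
            scores[i] = expected_posterior_entropy(belief, L)
        best = min(remaining, key=scores.get)
        joint = likelihoods[best] if joint is None else combine(joint, likelihoods[best])
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

In a point-based solver, a routine like this would run once per sampled belief point to fix the perception action before the usual Bellman backup, which is how the combinatorial blow-up of joint planning-perception actions is sidestepped.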
Pages: 2371-2377
Page count: 7