Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes

Cited: 0
Authors
Ghasemi, Mahsa [1]
Topcu, Ufuk [1]
Affiliation
[1] Univ Texas Austin, Austin, TX 78712 USA
Keywords
APPROXIMATIONS;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In conventional partially observable Markov decision processes, the observations that the agent receives originate from fixed, known distributions. In a variety of real-world scenarios, however, the agent plays an active role in its perception by selecting which observations to receive. Integrating planning and perception decisions directly would cause a combinatorial expansion of the action space; we avoid this through a greedy strategy for observation selection that minimizes an information-theoretic measure of state uncertainty. We develop a novel point-based value iteration algorithm that incorporates this greedy strategy to pick perception actions for each sampled belief point in each iteration. As a result, the solver not only requires fewer belief points to approximate the reachable subspace of the belief simplex, but also less computation per iteration. Further, we prove that the proposed algorithm achieves a near-optimal guarantee on the value function with respect to an optimal perception strategy, and we demonstrate its performance empirically.
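The greedy observation-selection idea described in the abstract can be illustrated with a small sketch. The code below is an assumption-laden illustration, not the paper's implementation: it uses expected posterior entropy of the belief as the information-theoretic uncertainty measure, assumes each candidate observation source has a known likelihood matrix p(z | s), and treats sources as conditionally independent given the state. All function names (`greedy_select`, `combine`, etc.) are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (nats)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def combine(L1, L2):
    """Joint likelihood of two sources assumed conditionally
    independent given the state: p(z1, z2 | s) = p(z1|s) p(z2|s)."""
    return np.einsum('as,bs->abs', L1, L2).reshape(-1, L1.shape[1])

def expected_posterior_entropy(belief, L):
    """E_z[H(b')] where b'(s) is the Bayes update of `belief` on z."""
    pz = L @ belief                       # marginal p(z)
    h = 0.0
    for z in range(L.shape[0]):
        if pz[z] > 1e-12:
            h += pz[z] * entropy(L[z] * belief / pz[z])
    return h

def greedy_select(belief, likelihoods, k):
    """Greedily pick k observation sources, at each step adding the
    one that most reduces the expected entropy of the belief."""
    chosen, joint = [], None
    remaining = list(range(len(likelihoods)))
    for _ in range(k):
        scores = {}
        for i in remaining:
            L = likelihoods[i] if joint is None else combine(joint, likelihoods[i])
            scores[i] = expected_posterior_entropy(belief, L)
        best = min(remaining, key=scores.get)
        joint = likelihoods[best] if joint is None else combine(joint, likelihoods[best])
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

In a point-based solver, a routine like this would run once per sampled belief point to fix the perception action before the usual Bellman backup, which is how the combinatorial blow-up of joint planning-perception actions is sidestepped.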
Pages: 2371-2377
Page count: 7