Finite horizon partially observable semi-Markov decision processes under risk probability criteria

Cited by: 0
Authors
Wen, Xin [1]
Guo, Xianping [2, 3]
Xia, Li [1, 3]
Affiliations
[1] Sun Yat Sen Univ, Sch Business, Guangzhou, Peoples R China
[2] Sun Yat Sen Univ, Sch Math, Guangzhou, Peoples R China
[3] Sun Yat Sen Univ, Guangdong Prov Key Lab Computat Sci, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Partially observable semi-Markov decision processes; Risk probability criterion; Finite horizon; Optimal Markov policy; INCOMPLETE INFORMATION; SENSITIVE CONTROL;
DOI
10.1016/j.orl.2024.107187
Chinese Library Classification
C93 [Management Science]; O22 [Operations Research];
Discipline classification codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
This paper deals with a risk probability minimization problem for finite horizon partially observable semi-Markov decision processes, which are among the most general models for stochastic dynamic systems. In contrast to the expected discounted and average criteria, the optimality criterion investigated in this paper is to minimize the probability that the accumulated rewards do not reach a prescribed profit level at the finite terminal stage. First, the state space is augmented with the joint conditional distribution of the current unobserved state and the remaining profit goal. We introduce a class of policies depending on observable histories and a class of Markov policies depending on the observable process together with the joint conditional distribution. Then, under mild assumptions, we prove by iteration techniques that the value function is the unique solution to the optimality equation for the probability criterion. The existence of an (ε-)optimal Markov policy for this problem is established. Finally, we use a bandit problem with the probability criterion to demonstrate our main results, giving an effective algorithm and the corresponding numerical calculation for the semi-Markov model. Moreover, for the case in which the model reduces to a discrete-time Markov model, we derive a concise solution.
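The idea of tracking the remaining profit goal can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a fully observed, discrete-time model with integer rewards, so the joint conditional distribution and the random sojourn times of the semi-Markov setting are left out. The transition table `P`, the reward table `R`, and the function `risk` are hypothetical names and values invented for illustration.

```python
from functools import lru_cache

# Hypothetical toy model (not taken from the paper): two states, two actions.
# P[s][a] lists (next_state, probability) pairs; R[s][a] is an integer reward.
P = {
    0: {0: [(0, 1.0)], 1: [(0, 0.5), (1, 0.5)]},
    1: {0: [(1, 1.0)], 1: [(0, 0.5), (1, 0.5)]},
}
R = {0: {0: 1, 1: 0}, 1: {0: 2, 1: 3}}


@lru_cache(maxsize=None)
def risk(state, goal, steps_left):
    """Minimal probability that the reward collected over the remaining
    steps falls short of `goal`, starting from `state`."""
    if steps_left == 0:
        # Terminal stage: the goal is met exactly when nothing remains to earn.
        return 0.0 if goal <= 0 else 1.0
    best = 1.0
    for action in P[state]:
        # The augmented state carries the profit goal still to be reached.
        remaining = goal - R[state][action]
        shortfall = sum(
            prob * risk(next_state, remaining, steps_left - 1)
            for next_state, prob in P[state][action]
        )
        best = min(best, shortfall)
    return best


if __name__ == "__main__":
    print(risk(0, 3, 2))
```

In this toy model a goal of 2 over two steps from state 0 can be met with certainty, whereas a goal of 3 requires a risky action and fails with probability 0.5, so the minimizing action depends on the remaining goal as well as on the state.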
Pages: 8
Related papers
50 in total
  • [31] Partially Observable Markov Decision Processes and Robotics
    Kurniawati, Hanna
    ANNUAL REVIEW OF CONTROL ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 5 : 253 - 277
  • [32] A tutorial on partially observable Markov decision processes
    Littman, Michael L.
    JOURNAL OF MATHEMATICAL PSYCHOLOGY, 2009, 53 (03) : 119 - 125
  • [33] A minimization problem of the risk probability in first passage semi-Markov decision processes with loss rates
    Huang XiangXiang
    Zou XiaoLong
    Guo XianPing
    SCIENCE CHINA-MATHEMATICS, 2015, 58 (09) : 1923 - 1938
  • [34] Quantum partially observable Markov decision processes
    Barry, Jennifer
    Barry, Daniel T.
    Aaronson, Scott
    PHYSICAL REVIEW A, 2014, 90 (03):
  • [35] Average criteria in denumerable semi-Markov decision chains under risk-aversion
    Cavazos-Cadena, Rolando
    Cruz-Suarez, Hugo
    Montes-De-Oca, Raul
    DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS, 2023, 33 (03) : 221 - 256
  • [36] A minimization problem of the risk probability in first passage semi-Markov decision processes with loss rates
    HUANG XiangXiang
    ZOU XiaoLong
    GUO XianPing
    Science China (Mathematics), 2015, 58 (09) : 1923 - 1938
  • [37] A minimization problem of the risk probability in first passage semi-Markov decision processes with loss rates
    XiangXiang Huang
    XiaoLong Zou
    XianPing Guo
    Science China Mathematics, 2015, 58 : 1923 - 1938
  • [38] MEAN-VARIANCE OPTIMALITY FOR SEMI-MARKOV DECISION PROCESSES UNDER FIRST PASSAGE CRITERIA
    Huang, Xiangxiang
    Huang, Yonghui
    KYBERNETIKA, 2017, 53 (01) : 59 - 81
  • [39] Partially observable Markov decision processes for risk-based screening
    Mrozack, Alex
    Liao, Xuejun
    Skatter, Sondre
    Carin, Lawrence
    ANOMALY DETECTION AND IMAGING WITH X-RAYS (ADIX), 2016, 9847
  • [40] OBSERVABLE AUGMENTED SYSTEMS FOR SENSITIVITY ANALYSIS OF MARKOV AND SEMI-MARKOV PROCESSES
    CASSANDRAS, CG
    STRICKLAND, SG
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1989, 34 (10) : 1026 - 1037