Optimal Markov Policies for Finite-Horizon Constrained MDPs With Combined Additive and Multiplicative Utilities

Cited by: 0
Authors
Kumar, Uday M. [1]
Kavitha, Veeraruna [2]
Bhat, Sanjay P. [1]
Hemachandra, Nandyala [2]
Affiliations
[1] TCS Res, Hyderabad 500081, India
[2] Indian Inst Technol, Dept Ind Engn & Operat Res, Mumbai 400076, India
Source
Keywords
Bilinear program; Markov decision processes; Markov policies; Optimal control; Utilities
DOI
10.1109/LCSYS.2023.3283470
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This letter considers the problem of optimizing a finite-horizon constrained Markov decision process (CMDP) where the objective and constraints are sums of additive and multiplicative utilities. To solve this, we construct another CMDP with only additive utilities whose optimal value over a restricted set of policies is equal to that of the original CMDP. Further, we provide a finite-dimensional bilinear program (BLP) whose value equals the CMDP value and whose solution provides the optimal policy. We also suggest an algorithm to solve the proposed BLP.
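To make the phrase "additive and multiplicative utilities" concrete, the following minimal sketch evaluates a fixed Markov policy on a small finite-horizon MDP. It is not the letter's construction or its bilinear program. It assumes, purely for illustration, that the additive utility is the expected sum of per-step rewards and the multiplicative utility is the expected product of per-step factors. The function `evaluate_policy` and the arrays `P`, `r`, `m`, and `policy` are hypothetical names and toy data.

```python
def evaluate_policy(P, r, m, policy, horizon, start):
    """Evaluate a Markov policy by backward recursion.

    Assumed (hypothetical) conventions:
      P[s][a][s2]     transition probability from s to s2 under action a
      r[s][a]         per-step additive reward
      m[s][a]         per-step multiplicative factor
      policy[t][s][a] probability of choosing action a in state s at time t
    Returns (expected sum of rewards, expected product of factors).
    """
    n_states = len(P)
    add = [0.0] * n_states  # empty sum when no steps remain
    mul = [1.0] * n_states  # empty product when no steps remain
    for t in reversed(range(horizon)):
        new_add, new_mul = [], []
        for s in range(n_states):
            add_val, mul_val = 0.0, 0.0
            for a, p_a in enumerate(policy[t][s]):
                next_add = sum(P[s][a][s2] * add[s2] for s2 in range(n_states))
                next_mul = sum(P[s][a][s2] * mul[s2] for s2 in range(n_states))
                add_val += p_a * (r[s][a] + next_add)   # rewards accumulate by addition
                mul_val += p_a * m[s][a] * next_mul     # factors accumulate by multiplication
            new_add.append(add_val)
            new_mul.append(mul_val)
        add, mul = new_add, new_mul
    return add[start], mul[start]


# Toy two-state, two-action example (all numbers are made up).
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [0.0, 1.0]],  # state 1: absorbing under both actions
]
r = [[1.0, 0.0], [2.0, 2.0]]
m = [[0.5, 1.0], [2.0, 2.0]]
step_rule = [[0.5, 0.5], [1.0, 0.0]]  # randomize in state 0, fixed action in state 1
policy = [step_rule, step_rule]       # same decision rule at both time steps

add_u, mul_u = evaluate_policy(P, r, m, policy, horizon=2, start=0)
combined = add_u + mul_u  # one possible "sum of additive and multiplicative" objective
print(add_u, mul_u, combined)
```

The two recursions differ only in how the per-step quantity is combined with the continuation value, which is why the multiplicative part cannot simply be folded into an ordinary additive reward.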
Pages: 2029 - 2034
Page count: 6
Related papers
50 in total
  • [41] Finite-horizon Markov population decision chains with constant risk posture
    White, Amanda M.
    Canbolat, Pelin G.
    NAVAL RESEARCH LOGISTICS, 2018, 65 (08) : 580 - 593
  • [42] OPTIMAL FINITE-HORIZON APPROXIMATION OF UNSTABLE LINEAR-SYSTEMS
    GUILLAUME, AM
    KABAMBA, PT
    JOURNAL OF GUIDANCE CONTROL AND DYNAMICS, 1985, 8 (02) : 278 - 280
  • [43] Finite-horizon optimal consumption and investment problem with a preference change
    Park, Kyunghyun
    Jeon, Junkee
    JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2019, 472 (02) : 1777 - 1802
  • [44] FINITE-HORIZON OPTIMAL-CONTROL WITH POINTWISE COST FUNCTIONAL
    PICCARDI, C
    APPLIED MATHEMATICS AND COMPUTATION, 1992, 52 (2-3) : 345 - 353
  • [45] Optimal Finite-Horizon Sensor Selection for Boolean Kalman Filter
    Imani, Mahdi
    Braga-Neto, Ulisses M.
    2017 FIFTY-FIRST ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2017, : 1481 - 1485
  • [46] Adaptive epidemic dissemination as a finite-horizon optimal stopping problem
    Kontos, T.
    Anagnostopoulos, C.
    Zervas, E.
    Hadjiefthymiades, S.
    WIRELESS NETWORKS, 2019, 25 (05) : 2315 - 2332
  • [47] PARALLEL BAYESIAN POLICIES FOR FINITE-HORIZON MULTIPLE COMPARISONS WITH A KNOWN STANDARD
    Hu, Weici
    Frazier, Peter I.
    Xie, Jing
    PROCEEDINGS OF THE 2014 WINTER SIMULATION CONFERENCE (WSC), 2014, : 3904 - 3915
  • [48] OPTIMAL PORTFOLIO AND CONSUMPTION DECISIONS FOR A SMALL INVESTOR ON A FINITE-HORIZON
    KARATZAS, I
    LEHOCZKY, JP
    SHREVE, SE
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1987, 25 (06) : 1557 - 1586
  • [49] Adaptive epidemic dissemination as a finite-horizon optimal stopping problem
    T. Kontos
    C. Anagnostopoulos
    E. Zervas
    S. Hadjiefthymiades
    Wireless Networks, 2019, 25 : 2315 - 2332
  • [50] Approximate finite-horizon optimal control without PDE's
    Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, United Kingdom
    Unknown
    Proc IEEE Conf Decis Control, : 1716 - 1721