Optimal Markov Policies for Finite-Horizon Constrained MDPs With Combined Additive and Multiplicative Utilities

Cited by: 0

Authors:

Kumar, Uday M. [1]

Kavitha, Veeraruna [2]

Bhat, Sanjay P. [1]

Hemachandra, Nandyala [2]

Affiliations:

[1] TCS Res, Hyderabad 500081, India

[2] Indian Inst Technol, Dept Ind Engn & Operat Res, Mumbai 400076, India

Source:

IEEE CONTROL SYSTEMS LETTERS | 2023 / Vol. 7

Keywords:

Bilinear program; Markov decision processes; Markov policies; Optimal control; Utilities

DOI:

10.1109/LCSYS.2023.3283470

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

This letter considers the problem of optimizing a finite-horizon constrained Markov decision process (CMDP) where the objective and constraints are sums of additive and multiplicative utilities. To solve this problem, we construct another CMDP with only additive utilities whose optimal value over a restricted set of policies is equal to that of the original CMDP. Further, we provide a finite-dimensional bilinear program (BLP) whose value equals the CMDP value and whose solution provides the optimal policy. We also suggest an algorithm to solve the proposed BLP.


Pages: 2029 - 2034

Number of pages: 6

50 records in total

[41] Finite-horizon Markov population decision chains with constant risk posture
White, Amanda M.
Canbolat, Pelin G.
NAVAL RESEARCH LOGISTICS, 2018, 65 (08) : 580 - 593
[42] OPTIMAL FINITE-HORIZON APPROXIMATION OF UNSTABLE LINEAR-SYSTEMS
GUILLAUME, AM
KABAMBA, PT
JOURNAL OF GUIDANCE CONTROL AND DYNAMICS, 1985, 8 (02) : 278 - 280
[43] Finite-horizon optimal consumption and investment problem with a preference change
Park, Kyunghyun
Jeon, Junkee
JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2019, 472 (02) : 1777 - 1802
[44] FINITE-HORIZON OPTIMAL-CONTROL WITH POINTWISE COST FUNCTIONAL
PICCARDI, C
APPLIED MATHEMATICS AND COMPUTATION, 1992, 52 (2-3) : 345 - 353
[45] Optimal Finite-Horizon Sensor Selection for Boolean Kalman Filter
Imani, Mahdi
Braga-Neto, Ulisses M.
2017 FIFTY-FIRST ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2017, : 1481 - 1485
[46] Adaptive epidemic dissemination as a finite-horizon optimal stopping problem
Kontos, T.
Anagnostopoulos, C.
Zervas, E.
Hadjiefthymiades, S.
WIRELESS NETWORKS, 2019, 25 (05) : 2315 - 2332
[47] PARALLEL BAYESIAN POLICIES FOR FINITE-HORIZON MULTIPLE COMPARISONS WITH A KNOWN STANDARD
Hu, Weici
Frazier, Peter I.
Xie, Jing
PROCEEDINGS OF THE 2014 WINTER SIMULATION CONFERENCE (WSC), 2014, : 3904 - 3915
[48] OPTIMAL PORTFOLIO AND CONSUMPTION DECISIONS FOR A SMALL INVESTOR ON A FINITE-HORIZON
KARATZAS, I
LEHOCZKY, JP
SHREVE, SE
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1987, 25 (06) : 1557 - 1586
[49] Adaptive epidemic dissemination as a finite-horizon optimal stopping problem
T. Kontos
C. Anagnostopoulos
E. Zervas
S. Hadjiefthymiades
Wireless Networks, 2019, 25 : 2315 - 2332
[50] Approximate finite-horizon optimal control without PDE's
Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, United Kingdom
Unknown
Proc IEEE Conf Decis Control, (1716-1721):
