Optimal Markov Policies for Finite-Horizon Constrained MDPs With Combined Additive and Multiplicative Utilities

Cited by: 0

Authors:

Kumar, Uday M. [1]

Kavitha, Veeraruna [2]

Bhat, Sanjay P. [1]

Hemachandra, Nandyala [2]

Affiliations:

[1] TCS Res, Hyderabad 500081, India

[2] Indian Inst Technol, Dept Ind Engn & Operat Res, Mumbai 400076, India

Source:

IEEE CONTROL SYSTEMS LETTERS | 2023 / Vol. 7

Keywords:

Bilinear program; Markov decision processes; Markov policies; Optimal control; Utilities

DOI:

10.1109/LCSYS.2023.3283470

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

This letter considers the problem of optimizing a finite-horizon constrained Markov decision process (CMDP) in which the objective and the constraints are sums of additive and multiplicative utilities. To solve this problem, we construct another CMDP with only additive utilities whose optimal value over a restricted set of policies equals that of the original CMDP. Further, we provide a finite-dimensional bilinear program (BLP) whose value equals the CMDP value and whose solution provides the optimal policy. We also suggest an algorithm to solve the proposed BLP.


Pages: 2029-2034

Number of pages: 6

Related records: 50 in total

[31] Finite-horizon optimal investment with transaction costs: construction of the optimal strategies
Christoph Belak
Jörn Sass
Finance and Stochastics, 2019, 23 : 861 - 888
[32] Finite-horizon input-constrained nonlinear optimal control using single network adaptive critics
Heydari, Ali
Balakrishnan, S.N.
Proceedings of the American Control Conference, 2011, : 3047 - 3052
[33] Neural network solution for finite-horizon H-infinity constrained optimal control of nonlinear systems
Cheng T.
Lewis F.L.
Journal of Control Theory and Applications, 2007, 5 (1) : 1 - 11
[34] Finite-Horizon Control-Constrained Nonlinear Optimal Control Using Single Network Adaptive Critics
Heydari, Ali
Balakrishnan, Sivasubramanya N.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2013, 24 (01) : 145 - 157
[35] Finite-horizon optimal investment with transaction costs: construction of the optimal strategies
Belak, Christoph
Sass, Joern
FINANCE AND STOCHASTICS, 2019, 23 (04) : 861 - 888
[36] Neural network solution for finite-horizon H-infinity constrained optimal control of nonlinear systems
Frank L.LEWIS
Journal of Control Theory and Applications, 2007, (01) : 1 - 11
[37] Finite-Horizon Input-Constrained Nonlinear Optimal Control Using Single Network Adaptive Critics
Heydari, Ali
Balakrishnan, S. N.
2011 AMERICAN CONTROL CONFERENCE, 2011, : 3047 - 3052
[38] FINITE-HORIZON MARKOV DECISION-PROCESSES WITH UNCERTAIN TERMINAL PAYOFFS
WHITE, DJ
OPERATIONS RESEARCH, 1995, 43 (05) : 862 - 869
[39] An extended ε-constraint method for a multiobjective finite-horizon Markov decision process
Eghbali-Zarch, Maryam
Tavakkoli-Moghaddam, Reza
Azaron, Amir
Dehghan-Sanej, Kazem
INTERNATIONAL TRANSACTIONS IN OPERATIONAL RESEARCH, 2022, 29 (05) : 3131 - 3160
[40] The Impact of Structural Policies on External Accounts in Infinite-horizon and Finite-horizon Models
Vogel, Lukas
REVIEW OF INTERNATIONAL ECONOMICS, 2013, 21 (01) : 103 - 117
