Deriving Explicit Control Policies for Markov Decision Processes Using Symbolic Regression

Cited by: 2
Authors
Hristov, A. [1]
Bosman, J. W. [1]
Bhulai, S. [2]
van der Mei, R. D. [1]
Affiliations
[1] Ctr Math & Comp Sci, Stochast Grp, Amsterdam, Netherlands
[2] Vrije Univ Amsterdam, Dept Math, Amsterdam, Netherlands
Keywords
Markov Decision Processes; Genetic program; Symbolic regression; Threshold-type policy; Optimal control; Closed-form approximation
DOI
10.1145/3388831.3388840
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
In this paper, we introduce a novel approach to optimizing the control of systems that can be modeled as Markov decision processes (MDPs) with a threshold-based optimal policy. Our method is based on a specific type of genetic program known as symbolic regression (SR). We present how the performance of this program can be greatly improved by taking into account the corresponding MDP framework in which we apply it. The proposed method has two main advantages: (1) it results in near-optimal decision policies, and (2) in contrast to other algorithms, it generates closed-form approximations. Obtaining an explicit expression for the decision policy gives the opportunity to conduct sensitivity analysis, and allows instant calculation of a new threshold function for any change in the parameters. We emphasize that the introduced technique is highly general and applicable to MDPs that have a threshold-based policy. Extensive experimentation demonstrates the usefulness of the method.
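To make the idea in the abstract concrete, here is a minimal sketch and not the authors' method. It assumes a toy admission-control queue (arrival rate `lam`, service rate `mu`, holding cost `h` per customer per step, rejection cost `r`, discount factor `beta`), computes the optimal threshold by value iteration, and then searches for a closed-form expression in `r` and `h` that matches those thresholds. The search is an exhaustive enumeration of small expression trees, swapped in for the genetic program so the example stays deterministic. The model, every parameter value, and all function names are assumptions made for illustration.

```python
import itertools
import math

def optimal_threshold(lam, mu, h, r, beta=0.99, n_max=40, iters=2000):
    """Value iteration on a hypothetical toy queue; returns the smallest
    queue length at which rejecting an arrival is optimal."""
    p, q = lam / (lam + mu), mu / (lam + mu)  # per-step event probabilities
    v = [0.0] * (n_max + 1)
    for _ in range(iters):
        new = []
        for x in range(n_max + 1):
            admit = v[x + 1] if x < n_max else math.inf  # full buffer: must reject
            reject = r + v[x]
            new.append(h * x + beta * (p * min(admit, reject)
                                       + q * v[max(x - 1, 0)]))
        v = new
    for x in range(n_max):
        if r + v[x] <= v[x + 1]:
            return x
    return n_max

def pdiv(a, b):
    """Protected division, a common convention in genetic programming."""
    return a / b if abs(b) > 1e-9 else 1.0

BINARY = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
          "*": lambda a, b: a * b, "/": pdiv}
UNARY = {"sqrt": lambda a: math.sqrt(abs(a)),
         "log": lambda a: math.log(1.0 + abs(a))}

def candidates(depth):
    """Enumerate (text, function) pairs for expression trees up to `depth`."""
    exprs = [("r", lambda r, h: r), ("h", lambda r, h: h),
             ("1", lambda r, h: 1.0), ("2", lambda r, h: 2.0)]
    for _ in range(depth):
        new = list(exprs)
        for name, op in UNARY.items():
            for s, f in exprs:
                new.append((f"{name}({s})",
                            lambda r, h, op=op, f=f: op(f(r, h))))
        for name, op in BINARY.items():
            for (s1, f1), (s2, f2) in itertools.product(exprs, repeat=2):
                new.append((f"({s1} {name} {s2})",
                            lambda r, h, op=op, f1=f1, f2=f2:
                            op(f1(r, h), f2(r, h))))
        exprs = new
    return exprs

def fit(data, depth=2):
    """Return (mse, length, text, function) of the best-fitting expression,
    breaking ties in favour of the shorter text."""
    best = None
    for s, f in candidates(depth):
        mse = sum((f(r, h) - t) ** 2 for (r, h), t in data) / len(data)
        if best is None or (mse, len(s)) < best[:2]:
            best = (mse, len(s), s, f)
    return best

# Training set: optimal thresholds on a small, arbitrary parameter grid.
GRID = list(itertools.product([2.0, 5.0, 10.0, 20.0], [0.5, 1.0, 2.0]))
DATA = [((r, h), optimal_threshold(0.5, 0.5, h, r)) for r, h in GRID]
mse, _, expr, threshold_fn = fit(DATA)
print(f"threshold(r, h) ~ {expr}  (MSE {mse:.3f})")
```

Once an expression has been selected, `threshold_fn(r, h)` evaluates it for new parameter values without rerunning value iteration, which is the practical benefit of a closed-form policy that the abstract points to. How well the expression generalizes beyond the training grid is not guaranteed by this sketch.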
Pages: 41-47
Page count: 7