Deriving Explicit Control Policies for Markov Decision Processes Using Symbolic Regression

Cited by: 2
Authors
Hristov, A. [1 ]
Bosman, J. W. [1 ]
Bhulai, S. [2 ]
van der Mei, R. D. [1 ]
Affiliations
[1] Ctr Math & Comp Sci, Stochast Grp, Amsterdam, Netherlands
[2] Vrije Univ Amsterdam, Dept Math, Amsterdam, Netherlands
Keywords
Markov Decision Processes; Genetic programming; Symbolic regression; Threshold-type policy; Optimal control; Closed-form approximation;
DOI
10.1145/3388831.3388840
CLC classification: TP31 [Computer software]
Discipline codes: 081202; 0835
Abstract
In this paper, we introduce a novel approach to optimizing the control of systems that can be modeled as Markov decision processes (MDPs) with a threshold-based optimal policy. Our method is based on a specific type of genetic program known as symbolic regression (SR). We show how the performance of this program can be greatly improved by taking into account the corresponding MDP framework in which we apply it. The proposed method has two main advantages: (1) it results in near-optimal decision policies, and (2) in contrast to other algorithms, it generates closed-form approximations. Obtaining an explicit expression for the decision policy makes it possible to conduct sensitivity analysis and allows instant calculation of a new threshold function for any change in the parameters. We emphasize that the introduced technique is highly general and applicable to MDPs that have a threshold-based policy. Extensive experimentation demonstrates the usefulness of the method.
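As an illustration of the setting the abstract describes, the sketch below solves a toy admission-control MDP whose average-reward optimal policy is of threshold type, using relative value iteration on the uniformized chain. In the paper's approach, symbolic regression would then be fitted over many (parameters, threshold) samples to produce a closed-form threshold function. The model, function names, and parameter values here are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

# Hypothetical toy model (not the paper's): a single queue with arrival rate
# lam and service rate mu; admitting a customer earns reward R, and each
# waiting customer incurs holding cost c per unit time.  The optimal policy
# admits arrivals only while the queue is below some threshold.

def optimal_threshold(lam=0.6, mu=1.0, R=5.0, c=1.0, N=50, iters=4000):
    """Smallest queue length at which rejecting a new arrival is optimal."""
    V = np.zeros(N + 1)
    rate = lam + mu                          # uniformization constant
    for _ in range(iters):
        Vn = np.empty_like(V)
        for x in range(N + 1):
            # on an arrival: admit (reward R, queue grows) or reject
            arrive = max(R + V[x + 1], V[x]) if x < N else V[x]
            serve = V[max(x - 1, 0)]         # on a service completion
            Vn[x] = (-c * x + lam * arrive + mu * serve) / rate
        V = Vn - Vn[0]                       # relative values (average reward)
    for x in range(N):                       # first state where rejecting wins
        if R + V[x + 1] <= V[x]:
            return x
    return N

print(optimal_threshold())                   # prints the optimal threshold
```

Repeating this computation over a grid of (lam, mu, R, c) values yields the training set on which an SR routine could search for an explicit expression such as threshold ≈ f(lam, mu, R, c), which is what makes instant recomputation and sensitivity analysis possible.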
Pages: 41-47 (7 pages)
Related papers (50 in total)
  • [21] Efficient Policies for Stationary Possibilistic Markov Decision Processes
    Ben Amor, Nahla
    El Khalfi, Zeineb
    Fargier, Helene
    Sabbadin, Regis
    SYMBOLIC AND QUANTITATIVE APPROACHES TO REASONING WITH UNCERTAINTY, ECSQARU 2017, 2017, 10369 : 306 - 317
  • [22] Learning Policies for Markov Decision Processes From Data
    Hanawal, Manjesh Kumar
    Liu, Hao
    Zhu, Henghui
    Paschalidis, Ioannis Ch.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (06) : 2298 - 2309
  • [23] A Note on Infectious Disease Control using Markov Decision Processes
    Maeda Y.
    IEEJ Transactions on Electronics, Information and Systems, 2022, 142 (03) : 339 - 340
  • [24] Mode-matching control policies for multi-mode Markov decision processes
    Ren, ZY
    Krogh, BH
    PROCEEDINGS OF THE 2001 AMERICAN CONTROL CONFERENCE, VOLS 1-6, 2001, : 95 - 100
  • [25] Sufficiency of Markov Policies for Continuous-Time Jump Markov Decision Processes
    Feinberg, Eugene A.
    Mandava, Manasa
    Shiryaev, Albert N.
    MATHEMATICS OF OPERATIONS RESEARCH, 2022, 47 (02) : 1266 - 1286
  • [26] Building Optimal Operation Policies for Dam Management Using Factored Markov Decision Processes
    Reyes, Alberto
    Ibargüengoytia, Pablo H.
    Romero, Ines
    Pech, David
    Borunda, Monica
    ADVANCES IN ARTIFICIAL INTELLIGENCE AND ITS APPLICATIONS, MICAI 2015, PT II, 2015, 9414 : 475 - 484
  • [27] Learning Parameterized Prescription Policies and Disease Progression Dynamics using Markov Decision Processes
    Zhu, Henghui
    Xu, Tingting
    Paschalidis, Ioannis Ch
    2019 AMERICAN CONTROL CONFERENCE (ACC), 2019, : 3438 - 3443
  • [28] Symbolic algorithms for qualitative analysis of Markov decision processes with Büchi objectives
    Chatterjee, Krishnendu
    Henzinger, Monika
    Joglekar, Manas
    Shah, Nisarg
    FORMAL METHODS IN SYSTEM DESIGN, 2013, 42 (03) : 301 - 327
  • [29] Conditions for the uniqueness of optimal policies of discounted Markov decision processes
    Daniel Cruz-Suárez
    Raúl Montes-de-Oca
    Francisco Salem-Silva
    Mathematical Methods of Operations Research, 2004, 60 : 415 - 436
  • [30] DISCOUNT-ISOTONE POLICIES FOR MARKOV DECISION-PROCESSES
    WHITE, DJ
    OR SPEKTRUM, 1988, 10 (01) : 13 - 22