Discounted fully probabilistic design of decision rules

Cited by: 0
Authors
Karny, Miroslav [1]
Molnarova, Sona [1]
Affiliations
[1] Czech Acad Sci, Inst Informat Theory & Automat, Dept Adapt Syst, Vodarenskou Vezi 4, Prague 18200 8, Czech Republic
Keywords
Design principles; Kullback-Leibler's divergence; Probabilistic techniques; Discounting; Closed loop; DIVERGENCE; PREFERENCE;
DOI
10.1016/j.ins.2024.121578
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Axiomatic fully probabilistic design (FPD) of optimal decision rules strictly extends the decision making (DM) theory represented by Markov decision processes (MDP). This means that any MDP task can be approximated by an explicitly found FPD task, whereas many FPD tasks have no MDP equivalent. MDP and FPD model the closed loop - the coupling of an agent and its environment - via a joint probability density (pd) relating the involved random variables, referred to as behaviour. Unlike MDP, FPD quantifies the agent's aims and constraints by an ideal pd. The ideal pd is high on desired behaviours, small on undesired behaviours and zero on forbidden ones. FPD selects the optimal decision rules as the minimiser of the Kullback-Leibler divergence of the closed-loop-modelling pd to its ideal twin. The choice of this proximity measure follows from the FPD axiomatics. MDP minimises the expected total loss, which is usually the sum of discounted partial losses. The discounting reflects the decreasing importance of future losses. It also diminishes the influence of errors caused by: the imperfection of the employed environment model; roughly expressed aims; approximate learning and decision-rule design. The established FPD cannot currently account for these important features. The paper elaborates the missing discounted version of FPD. This non-trivial filling of the gap in FPD also employs an extension of dynamic programming, which is of independent interest.
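The KL-minimising selection of a decision rule described in the abstract can be illustrated on a toy case. The sketch below is not the paper's discounted design: it assumes a single decision step, finite action and outcome sets, strictly positive pds, and made-up numbers. Under those assumptions the minimiser has the closed form r*(a) ∝ r_ideal(a) exp(-ω(a)), where ω(a) is the KL divergence of the environment model m(y|a) from its ideal counterpart. All names (`model`, `ideal_model`, `ideal_rule`, `fpd_one_step`, `kl_closed_loop`) are hypothetical.

```python
import numpy as np


def kl_closed_loop(rule, model, ideal_rule, ideal_model):
    """KL divergence of the closed-loop pd model[a, y] * rule[a]
    from the ideal pd ideal_model[a, y] * ideal_rule[a]."""
    joint = rule[:, None] * model
    ideal_joint = ideal_rule[:, None] * ideal_model
    mask = joint > 0  # terms with zero probability contribute nothing
    return float(np.sum(joint[mask] * np.log(joint[mask] / ideal_joint[mask])))


def fpd_one_step(model, ideal_rule, ideal_model):
    """One-step FPD on finite sets (assumes strictly positive pds).

    Returns the randomised rule minimising kl_closed_loop and the
    attained minimum, which equals -log of the normalising constant.
    """
    # omega[a] = KL( model[a, :] || ideal_model[a, :] )
    omega = np.sum(model * np.log(model / ideal_model), axis=1)
    unnormalised = ideal_rule * np.exp(-omega)
    z = unnormalised.sum()
    return unnormalised / z, float(-np.log(z))


# Illustrative numbers only: 3 actions, 2 outcomes; rows are m(y | a).
model = np.array([[0.9, 0.1],
                  [0.5, 0.5],
                  [0.2, 0.8]])
# The ideal pd prefers outcome 0 whatever the action.
ideal_model = np.array([[0.8, 0.2]] * 3)
# No prior preference among actions.
ideal_rule = np.full(3, 1.0 / 3.0)

rule_opt, kl_min = fpd_one_step(model, ideal_rule, ideal_model)
print("optimal rule:", rule_opt)
print("minimal KL divergence:", kl_min)
```

The resulting rule is randomised rather than a hard argmin: actions whose predicted outcomes lie far from the ideal are down-weighted exponentially, not excluded.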
Pages: 12
Related papers
50 in total
  • [21] THE STRUCTURE OF COALITIONAL POWER UNDER PROBABILISTIC GROUP DECISION RULES
    BANDYOPADHYAY, T
    DEB, R
    PATTANAIK, PK
    JOURNAL OF ECONOMIC THEORY, 1982, 27 (02) : 366 - 375
  • [22] Joint dynamic probabilistic constraints with projected linear decision rules
    Guigues, V.
    Henrion, R.
    OPTIMIZATION METHODS & SOFTWARE, 2017, 32 (05) : 1006 - 1032
  • [23] Correction to: The doctrinal paradox: comparison of decision rules in a probabilistic framework
    Aureli Alabert
    Mercè Farré
    Social Choice and Welfare, 2022, 58 : 897 - 899
  • [24] Discounted Properties of Probabilistic Pushdown Automata
    Brazdil, Tomas
    Brozek, Vaclav
    Holecek, Jan
    Kucera, Antonin
    LOGIC FOR PROGRAMMING, ARTIFICIAL INTELLIGENCE, AND REASONING, PROCEEDINGS, 2008, 5330 : 230 - 242
  • [25] Fully probabilistic control design in an adaptive critic framework
    Herzallah, Randa
    Karny, Miroslav
    NEURAL NETWORKS, 2011, 24 (10) : 1128 - 1135
  • [26] THE WAY FOR OPTIMISING OF CONCRETE STRUCTURES: FULLY PROBABILISTIC DESIGN
    Stepanek, Petr
    Lanikova, Ivana
    Simunek, Petr
    DESIGN OF CONCRETE STRUCTURES USING EN 1992-1-1, 2010 : 291 - 298
  • [27] Critical review of fully probabilistic design for seismic loadings
    Huh, J
    Mehrabian, A
    Haldar, A
    Salazar, AR
    STRUCTURAL ENGINEERING IN THE 21ST CENTURY, 1999 : 480 - 483
  • [28] A fully probabilistic design for stochastic systems with input delay
    Herzallah, Randa
    INTERNATIONAL JOURNAL OF CONTROL, 2021, 94 (11) : 2934 - 2944
  • [29] A Probabilistic Evaluation of Wind Turbine Fatigue Design Rules
    Veldkamp, Dick
    WIND ENERGY, 2008, 11 (06) : 655 - 672
  • [30] DECISION RULES, DECISION RULES
    KENDLER, HH
    BEHAVIORAL AND BRAIN SCIENCES, 1978, 1 (01) : 64 - 65