Piecewise linear value function approximation for factored MDPs

Authors
Poupart, P [1]
Boutilier, C [1]
Patrascu, R [1]
Schuurmans, D [1]
Affiliation
[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 3H5, Canada
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A number of proposals have been put forth in recent years for the solution of Markov decision processes (MDPs) whose state (and sometimes action) spaces are factored. One recent class of methods involves linear value function approximation, where the optimal value function is assumed to be a linear combination of some set of basis functions, with the aim of finding suitable weights. While sophisticated techniques have been developed for finding the best approximation within this constrained space, few methods have been proposed for choosing a suitable basis set, or modifying it if solution quality is found wanting. We propose a general framework, and specific proposals, that address both of these questions. In particular, we examine weakly coupled MDPs where a number of subtasks can be viewed independently modulo resource constraints. We then describe methods for constructing a piecewise linear combination of the subtask value functions, using greedy decision tree techniques. We argue that this architecture is suitable for many types of MDPs whose combinatorics are determined largely by the existence of multiple conflicting objectives.
Pages: 292-299
Page count: 8