Piecewise linear value function approximation for factored MDPs

Cited by: 0

Authors:

Poupart, P [1]

Boutilier, C [1]

Patrascu, R [1]

Schuurmans, D [1]

Affiliation:

[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 3H5, Canada

Source:

EIGHTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-02)/FOURTEENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-02), PROCEEDINGS | 2002

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A number of proposals have been put forth in recent years for the solution of Markov decision processes (MDPs) whose state (and sometimes action) spaces are factored. One recent class of methods involves linear value function approximation, where the optimal value function is assumed to be a linear combination of some set of basis functions, with the aim of finding suitable weights. While sophisticated techniques have been developed for finding the best approximation within this constrained space, few methods have been proposed for choosing a suitable basis set, or modifying it if solution quality is found wanting. We propose a general framework, and specific proposals, that address both of these questions. In particular, we examine weakly coupled MDPs where a number of subtasks can be viewed independently modulo resource constraints. We then describe methods for constructing a piecewise linear combination of the subtask value functions, using greedy decision tree techniques. We argue that this architecture is suitable for many types of MDPs whose combinatorics are determined largely by the existence of multiple conflicting objectives.
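To make the two approximation architectures named in the abstract concrete, the following is a minimal Python sketch. It is an illustration only, not the authors' implementation: the state variables (`a_done`, `b_done`, `resource_free`), the two subtask value functions, the weights, and the tree are all hypothetical, and the sketch only evaluates a given tree. It does not perform the greedy tree construction or the weight fitting that the paper describes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Union

State = Dict[str, int]              # factored state: variable name -> value
Basis = Callable[[State], float]    # a basis (here: subtask value) function


def linear_value(state: State, basis: Sequence[Basis],
                 weights: Sequence[float]) -> float:
    """Linear approximation: V(s) ~ sum_i w_i * h_i(s), one weight vector overall."""
    return sum(w * h(state) for w, h in zip(weights, basis))


@dataclass
class Leaf:
    weights: Sequence[float]        # weight vector used in this region


@dataclass
class Split:
    variable: str                   # state variable tested at this node
    if_zero: "Tree"                 # subtree when state[variable] == 0
    if_nonzero: "Tree"              # subtree otherwise


Tree = Union[Leaf, Split]


def piecewise_linear_value(state: State, basis: Sequence[Basis],
                           tree: Tree) -> float:
    """Piecewise linear approximation: the tree picks a region, the leaf its weights."""
    node = tree
    while isinstance(node, Split):
        node = node.if_zero if state[node.variable] == 0 else node.if_nonzero
    return linear_value(state, basis, node.weights)


# Hypothetical subtask value functions, each solved as if it were independent.
def v_task_a(s: State) -> float:
    return 10.0 if s["a_done"] else 4.0


def v_task_b(s: State) -> float:
    return 6.0 if s["b_done"] else 2.0


BASIS = [v_task_a, v_task_b]

# Hypothetical tree: when the shared resource is busy, task B's value is discounted.
TREE = Split("resource_free", if_zero=Leaf([1.0, 0.25]), if_nonzero=Leaf([1.0, 1.0]))
```

In this sketch a single weight vector cannot express that the subtasks interfere only when the resource is contended, whereas the tree assigns a different weight vector to each region of the state space.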


Pages: 292-299

Page count: 8

