Planning with Abstract Markov Decision Processes

Cited by: 0
Authors
Gopalan, Nakul [1 ]
desJardins, Marie [2 ]
Littman, Michael L. [1 ]
MacGlashan, James [3 ]
Squire, Shawn [2 ]
Tellex, Stefanie [1 ]
Winder, John [2 ]
Wong, Lawson L. S. [1 ]
Affiliations
[1] Brown Univ, Providence, RI 02912 USA
[2] Univ Maryland Baltimore Cty, Baltimore, MD 21250 USA
[3] Cogitai Inc, Riverside, RI USA
Funding
National Science Foundation (US); National Aeronautics and Space Administration (NASA);
DOI
Not available
CLC classification
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Robots acting in human-scale environments must plan under uncertainty in large state-action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state-action spaces requires hierarchical abstraction for efficient computation. We introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level "flat" MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks.
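The abstract's core idea, planning over abstract subtasks with local goals so the low-level planner never reasons about the global task, can be illustrated with a minimal toy sketch. This is a hypothetical example (the corridor domain, function names, and subtask labels are invented for illustration), not the authors' implementation:

```python
# Minimal illustrative sketch of AMDP-style hierarchical planning
# (hypothetical toy domain, NOT the paper's code).
# A 1-D "taxi" corridor: go to the passenger, then to the destination.
# The abstract level sequences subtasks; each subtask is solved
# locally with BFS over primitive left/right moves.

from collections import deque

CORRIDOR = 5            # cells 0..4
PASSENGER, DEST = 1, 4  # passenger waits at cell 1, destination is cell 4

def solve_subtask(start, goal):
    """Low-level planner: BFS over +/-1 moves toward a *local* goal."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        pos, path = frontier.popleft()
        if pos == goal:
            return path
        for step in (-1, +1):
            nxt = pos + step
            if 0 <= nxt < CORRIDOR and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [step]))
    return None

def amdp_plan(taxi_pos):
    """Abstract level: each abstract action carries its own local goal,
    so the low-level planner never sees the full task."""
    plan = []
    for goal, label in [(PASSENGER, "goto_passenger"),
                        (DEST, "goto_destination")]:
        moves = solve_subtask(taxi_pos, goal)
        plan.append((label, moves))
        taxi_pos = goal
    return plan

for label, moves in amdp_plan(taxi_pos=0):
    print(label, moves)
# prints: goto_passenger [1]
#         goto_destination [1, 1, 1]
```

Each subtask here is solved independently and optimally for its local goal, which mirrors the paper's claim of per-level optimality; overall recursive optimality would additionally require the abstract transition and reward models to be correct.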
Pages: 480 - 488
Number of pages: 9
Related papers
50 records in total
  • [21] Simple Regret Optimization in Online Planning for Markov Decision Processes
    Feldman, Zohar
    Domshlak, Carmel
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2014, 51 : 165 - 205
  • [22] Markov decision processes
    White, D.J.
    Journal of the Operational Research Society, 1995, 46 (06):
  • [23] Markov Decision Processes
    Bäuerle, N.
    Rieder, U.
    Jahresbericht der Deutschen Mathematiker-Vereinigung, 2010, 112 (4) : 217 - 243
  • [24] Driving force planning in shield tunneling based on Markov decision processes
    Hu, XiangTao
    Huang, YongAn
    Yin, ZhouPing
    Xiong, YouLun
    SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2012, 55 (04) : 1022 - 1030
  • [25] Unifying nondeterministic and probabilistic planning through imprecise Markov Decision Processes
    Trevizan, Felipe W.
    Cozman, Fabio G.
    de Barros, Leliane N.
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA-SBIA 2006, PROCEEDINGS, 2006, 4140 : 502 - 511
  • [27] Robust motion planning using Markov Decision Processes and quadtree decomposition
    Burlet, J
    Aycard, O
    Fraichard, T
    2004 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1- 5, PROCEEDINGS, 2004, : 2820 - 2825
  • [29] Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
    Jonsson, Anders
    Kaufmann, Emilie
    Menard, Pierre
    Domingues, Omar Darwiche
    Leurent, Edouard
    Valko, Michal
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [30] A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes
    Ross, Stephane
    Pineau, Joelle
    Chaib-draa, Brahim
    Kreitmann, Pierre
    JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 1729 - 1770