Planning with Abstract Markov Decision Processes

Cited by: 0
Authors
Gopalan, Nakul [1 ]
desJardins, Marie [2 ]
Littman, Michael L. [1 ]
MacGlashan, James [3 ]
Squire, Shawn [2 ]
Tellex, Stefanie [1 ]
Winder, John [2 ]
Wong, Lawson L. S. [1 ]
Affiliations
[1] Brown Univ, Providence, RI 02912 USA
[2] Univ Maryland Baltimore Cty, Baltimore, MD 21250 USA
[3] Cogitai Inc, Riverside, RI USA
Funding
National Science Foundation (US); National Aeronautics and Space Administration (NASA);
DOI
Not available
CLC classification
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Robots acting in human-scale environments must plan under uncertainty in large state-action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state-action spaces requires hierarchical abstraction for efficient computation. We introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level "flat" MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks.
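The abstract's core idea, planning over abstract subtasks with local goals so the low-level planner never reasons about the global task, can be illustrated with a minimal toy sketch. This is a hypothetical example (the corridor domain, function names, and subtask labels are invented for illustration), not the authors' implementation:

```python
# Minimal illustrative sketch of AMDP-style hierarchical planning
# (hypothetical toy domain, NOT the paper's code).
# A 1-D "taxi" corridor: go to the passenger, then to the destination.
# The abstract level sequences subtasks; each subtask is solved
# locally with BFS over primitive left/right moves.

from collections import deque

CORRIDOR = 5            # cells 0..4
PASSENGER, DEST = 1, 4  # passenger waits at cell 1, destination is cell 4

def solve_subtask(start, goal):
    """Low-level planner: BFS over +/-1 moves toward a *local* goal."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        pos, path = frontier.popleft()
        if pos == goal:
            return path
        for step in (-1, +1):
            nxt = pos + step
            if 0 <= nxt < CORRIDOR and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [step]))
    return None

def amdp_plan(taxi_pos):
    """Abstract level: each abstract action carries its own local goal,
    so the low-level planner never sees the full task."""
    plan = []
    for goal, label in [(PASSENGER, "goto_passenger"),
                        (DEST, "goto_destination")]:
        moves = solve_subtask(taxi_pos, goal)
        plan.append((label, moves))
        taxi_pos = goal
    return plan

for label, moves in amdp_plan(taxi_pos=0):
    print(label, moves)
# prints: goto_passenger [1]
#         goto_destination [1, 1, 1]
```

Each subtask here is solved independently and optimally for its local goal, which mirrors the paper's claim of per-level optimality; overall recursive optimality would additionally require the abstract transition and reward models to be correct.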
Pages: 480 - 488
Number of pages: 9
Related papers
50 records in total
  • [21] Simple Regret Optimization in Online Planning for Markov Decision Processes
    Feldman, Zohar
    Domshlak, Carmel
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2014, 51 : 165 - 205
  • [22] Markov decision processes
    White, D.J.
    Journal of the Operational Research Society, 1995, 46 (06):
  • [23] Markov Decision Processes
    Bäuerle, N.
    Rieder, U.
    Jahresbericht der Deutschen Mathematiker-Vereinigung, 2010, 112 (4) : 217 - 243
  • [24] Driving force planning in shield tunneling based on Markov decision processes
    Hu, XiangTao
    Huang, YongAn
    Yin, ZhouPing
    Xiong, YouLun
    SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2012, 55 (04) : 1022 - 1030
  • [25] Unifying nondeterministic and probabilistic planning through imprecise Markov Decision Processes
    Trevizan, Felipe W.
    Cozman, Fabio G.
    de Barros, Leliane N.
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA-SBIA 2006, PROCEEDINGS, 2006, 4140 : 502 - 511
  • [27] Robust motion planning using Markov Decision Processes and quadtree decomposition
    Burlet, J
    Aycard, O
    Fraichard, T
    2004 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1- 5, PROCEEDINGS, 2004, : 2820 - 2825
  • [29] Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
    Jonsson, Anders
    Kaufmann, Emilie
    Menard, Pierre
    Domingues, Omar Darwiche
    Leurent, Edouard
    Valko, Michal
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [30] A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes
    Ross, Stephane
    Pineau, Joelle
    Chaib-draa, Brahim
    Kreitmann, Pierre
    JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 1729 - 1770