Episodic task learning in Markov decision processes

Cited by: 0
Authors
Yong Lin
Fillia Makedon
Yurong Xu
Affiliations
[1] Computer Science & Engineering
[2] Oracle Corporation
Keywords
Optimal Policy; Task State; Markov Decision Process; Belief State; Hierarchical Approach;
DOI: not available
Abstract
Hierarchical algorithms for Markov decision processes have proved useful in problem domains with multiple subtasks. Although existing hierarchical approaches are strong in task decomposition, they are weak in task abstraction, which is more important for task analysis and modeling. In this paper, we propose a task-oriented design that strengthens task abstraction. Our approach learns an episodic task model from the problem domain, with which the planner achieves the same control effect as the original model, but with a more concise structure and substantially better performance. According to our analysis and experimental evaluation, our approach outperforms existing hierarchical algorithms such as MAXQ and HEXQ.
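For background on the planning problem the abstract refers to, the sketch below solves a toy Markov decision process by value iteration. It is a generic illustration only, assuming a hypothetical 4-state chain with deterministic moves and a goal reward; it does not implement the paper's episodic task model or its hierarchical abstraction.

```python
# Value iteration on a toy episodic MDP: a 4-state chain where the agent
# moves "left"/"right" and earns reward 1 on reaching the goal state 3.
GAMMA = 0.9
STATES = [0, 1, 2, 3]          # state 3 is terminal (goal)
ACTIONS = ["left", "right"]

def step(s, a):
    """Deterministic transition: returns (next_state, reward)."""
    if s == 3:                  # terminal: absorbing, no further reward
        return s, 0.0
    s2 = max(0, s - 1) if a == "left" else min(3, s + 1)
    return s2, (1.0 if s2 == 3 else 0.0)

def value_iteration(tol=1e-8):
    """Iterate the Bellman optimality backup until values converge."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s == 3:
                continue
            best = max(step(s, a)[1] + GAMMA * V[step(s, a)[0]]
                       for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration()
# Optimal policy always moves right: V[2] = 1, V[1] = 0.9, V[0] = 0.81
print({s: round(v, 2) for s, v in V.items()})
```

A hierarchical method like those the abstract compares against would decompose such a domain into subtasks and plan over them; the flat backup above is the baseline such methods improve on.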
Pages: 87–98 (11 pages)