Monte Carlo Hierarchical Model Learning

Cited by: 0
Authors
Menashe, Jacob [1]
Stone, Peter [1]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
Keywords
Single and multi-agent learning techniques; Reinforcement Learning; Factored Domains; Model Learning; Hierarchical Skill Learning; Monte Carlo Methods;
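DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract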
Reinforcement learning (RL) is a well-established paradigm for enabling autonomous agents to learn from experience. To enable RL to scale to any but the smallest domains, it is necessary to make use of abstraction and generalization of the state-action space, for example with a factored representation. However, to make effective use of such a representation, it is necessary to determine which state variables are relevant in which situations. In this work, we introduce T-UCT, a novel model-based RL approach for learning and exploiting the dynamics of structured hierarchical environments. When learning the dynamics while acting, a partial or inaccurate model may do more harm than good. T-UCT uses graph-based planning and Monte Carlo simulations to exploit models that may be incomplete or inaccurate, allowing it to both maximize cumulative rewards and ignore trajectories that are unlikely to succeed. T-UCT incorporates new experiences in the form of more accurate plans that span a greater area of the state space. T-UCT is fully implemented and compared empirically against B-VISA, the best known prior approach to the same problem. We show that T-UCT learns hierarchical models with fewer samples than B-VISA and that this effect is magnified at deeper levels of hierarchical complexity.
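The abstract's idea of using Monte Carlo simulations to ignore trajectories that are unlikely to succeed can be illustrated with a minimal sketch. This is not the authors' T-UCT algorithm: the function names `estimate_success` and `filter_plans`, the per-action success probabilities, and the threshold are all hypothetical, chosen only to show how rollouts under a possibly inaccurate learned model might be used to screen candidate plans.

```python
import random

def estimate_success(plan, step_success_prob, n_rollouts=1000, seed=0):
    """Monte Carlo estimate of the chance that every step of `plan` succeeds.

    `step_success_prob` is a hypothetical stand-in for a learned model: it maps
    each action to the probability that the action has its intended effect.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_rollouts):
        # One simulated rollout: the plan succeeds only if every step does.
        if all(rng.random() < step_success_prob[action] for action in plan):
            successes += 1
    return successes / n_rollouts

def filter_plans(plans, step_success_prob, threshold=0.5, **kwargs):
    """Keep only the plans whose estimated success rate meets the threshold."""
    return [
        plan for plan in plans
        if estimate_success(plan, step_success_prob, **kwargs) >= threshold
    ]

if __name__ == "__main__":
    # Hypothetical learned model: "push" is unreliable, the others are not.
    model = {"move": 0.95, "grab": 0.9, "push": 0.3}
    candidates = [["move", "grab"], ["move", "push", "grab"]]
    for plan in candidates:
        print(plan, estimate_success(plan, model))
    print("kept:", filter_plans(candidates, model, threshold=0.5))
```

Because the estimate comes from sampled rollouts rather than exact inference, it stays usable when the model is only approximately right, at the cost of sampling noise that shrinks as `n_rollouts` grows.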
Pages: 1985-1986
Number of pages: 2