Monte Carlo Tree Search with Robust Exploration

Cited by: 0
Authors
Imagawa, Takahisa [1, 2]
Kaneko, Tomoyuki [1 ]
Affiliations
[1] Univ Tokyo, Grad Sch Arts & Sci, Tokyo, Japan
[2] Japan Soc Promot Sci, Tokyo, Japan
Source
COMPUTERS AND GAMES, CG 2016 | 2016 / Vol. 10068
DOI
10.1007/978-3-319-50935-8_4
Chinese Library Classification number
TP31 [Computer Software];
Subject classification code
081202; 0835;
Abstract
This paper presents a new Monte-Carlo tree search method that focuses on identifying the best move. UCT, which minimizes cumulative regret, has achieved remarkable success in Go and other games. However, recent studies on simple regret reveal that there are better exploration strategies. To further improve performance, the leaf to be explored is determined not only by the mean but also by the whole reward distribution. We adopt a hybrid approach to obtain reliable distributions: a negamax-style backup of reward distributions is used in the shallower half of the search tree, and UCT is adopted in the rest of the tree. Experiments on synthetic trees show that the presented method outperformed UCT and similar methods, except on trees having uniform width and depth.
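The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the exploration constant `c`, the discrete reward grid on [0, 1], and the assumption that child rewards are independent are all choices made here for illustration. `ucb1_score` is the standard UCB1 rule that UCT applies at each node, and `negamax_backup` shows one way a parent's reward distribution could be derived from its children's distributions.

```python
import math


def ucb1_score(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score used by UCT for child selection.

    The constant c is an assumed default; unvisited children are tried first.
    """
    if visits == 0:
        return math.inf
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)


def negamax_backup(child_dists):
    """Distribution of max_i(1 - X_i) for independent child rewards X_i.

    Each child distribution is a list of probabilities over the same evenly
    spaced reward grid on [0, 1], seen from the child's (opponent's) side.
    Independence between children is an assumption of this sketch.
    """
    n = len(child_dists[0])
    # Flip each child to the parent's point of view: reward r becomes 1 - r.
    flipped = [list(reversed(d)) for d in child_dists]
    # The CDF of a maximum of independent variables is the product of CDFs.
    cdf = [1.0] * n
    for d in flipped:
        running = 0.0
        for k in range(n):
            running += d[k]
            cdf[k] *= running
    # Convert the CDF back to a probability mass function.
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, n)]


if __name__ == "__main__":
    # Reward grid {0, 0.5, 1}; both children are a coin flip between 0 and 1.
    coin = [0.5, 0.0, 0.5]
    print(negamax_backup([coin, coin]))
    print(ucb1_score(0.5, 10, 100))
```

In a hybrid scheme of the kind the abstract describes, a backup like `negamax_backup` would be applied at shallow nodes and a rule like `ucb1_score` at deeper ones. How the depth threshold is chosen and how leaves are ranked from the resulting distributions are not specified in the abstract and are left out here.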
Pages: 34-46
Number of pages: 13