Monte-Carlo Simulation Balancing Revisited

Cited by: 0
Authors
Graf, Tobias [1]
Platzner, Marco [2]
Affiliations
[1] Univ Paderborn, Int Grad Sch Dynam Intelligent Syst, Paderborn, Germany
[2] Univ Paderborn, Paderborn, Germany
Keywords
GAME;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Simulation Balancing is an optimization algorithm that automatically tunes the parameters of a playout policy used inside Monte Carlo Tree Search. The algorithm fits a policy so that the expected result of the policy matches the given target values of the training set. Up to now it has been successfully applied to Computer Go on small 9 x 9 boards but has failed for larger board sizes such as 19 x 19. On these large boards apprenticeship learning, which fits a policy so that it closely follows an expert, continues to be the algorithm of choice. In this paper we introduce several improvements to the original simulation balancing algorithm and test their effectiveness in Computer Go. The proposed additions remove the need to generate target values by deep searches, optimize faster, and make the algorithm less prone to overfitting. The experiments show that simulation balancing improves the playing strength of a Go program using apprenticeship learning by more than 200 Elo on the large 19 x 19 board.
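The core idea in the abstract, fitting a policy so that its expected playout result matches a target value, can be illustrated with a minimal sketch. This is not the authors' improved algorithm and has nothing to do with Go: it assumes a hypothetical one-move "game" in which each action wins with a fixed probability, a softmax policy with one parameter per action, and a basic policy-gradient balancing update of the form theta += alpha * (target - V) * g. All names (`simulate`, `balance`, `win_prob`, `target`) and all hyperparameters are illustrative assumptions.

```python
import math
import random


def softmax(theta):
    """Turn action preferences into a probability distribution."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def simulate(theta, win_prob, rng):
    """Play one single-move playout (toy assumption, not a Go playout).

    Returns the outcome z in {0, 1} and the gradient of log pi(a)
    with respect to theta for the sampled action a.
    """
    probs = softmax(theta)
    a = rng.choices(range(len(theta)), weights=probs)[0]
    z = 1.0 if rng.random() < win_prob[a] else 0.0
    grad = [(1.0 if i == a else 0.0) - p for i, p in enumerate(probs)]
    return z, grad


def balance(theta, win_prob, target, rng, steps=1000, m=50, n=50, alpha=0.2):
    """Nudge theta so the expected playout result approaches `target`."""
    theta = list(theta)
    for _ in range(steps):
        # V: mean outcome of m playouts under the current policy.
        v = sum(simulate(theta, win_prob, rng)[0] for _ in range(m)) / m
        # g: mean of z * grad log pi over n independent playouts.
        g = [0.0] * len(theta)
        for _ in range(n):
            z, grad = simulate(theta, win_prob, rng)
            for i, gi in enumerate(grad):
                g[i] += z * gi / n
        # Move the expected result toward the target value.
        theta = [t + alpha * (target - v) * gi for t, gi in zip(theta, g)]
    return theta


def expected_value(theta, win_prob):
    """Exact expected outcome of the policy in the toy game."""
    return sum(p * w for p, w in zip(softmax(theta), win_prob))


if __name__ == "__main__":
    rng = random.Random(0)
    win_prob = [0.9, 0.1]  # hypothetical per-action win probabilities
    theta = balance([0.0, 0.0], win_prob, target=0.6, rng=rng)
    print(expected_value(theta, win_prob))
```

In this toy setting the uniform starting policy has an expected result of 0.5, and the update shifts probability toward the stronger action until the expected result is close to the target of 0.6. Note that the policy is balanced toward the target rather than pushed to always pick the strongest action.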
Pages: 7