Monte-Carlo Simulation Balancing Revisited

Cited by: 0
Authors
Graf, Tobias [1]
Platzner, Marco [2]
Affiliations
[1] Univ Paderborn, Int Grad Sch Dynam Intelligent Syst, Paderborn, Germany
[2] Univ Paderborn, Paderborn, Germany
Keywords
GAME
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Simulation Balancing is an optimization algorithm to automatically tune the parameters of a playout policy used inside a Monte Carlo Tree Search. The algorithm fits a policy so that the expected result of a policy matches given target values of the training set. Up to now it has been successfully applied to Computer Go on small 9 x 9 boards but failed for larger board sizes like 19 x 19. On these large boards, apprenticeship learning, which fits a policy so that it closely follows an expert, continues to be the algorithm of choice. In this paper we introduce several improvements to the original simulation balancing algorithm and test their effectiveness in Computer Go. The proposed additions remove the necessity to generate target values by deep searches, optimize faster and make the algorithm less prone to overfitting. The experiments show that simulation balancing improves the playing strength of a Go program using apprenticeship learning by more than 200 ELO on the large board size 19 x 19.
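The core idea stated in the abstract, fitting a policy so that its expected result matches a target value, can be illustrated with a minimal sketch. This is not the authors' improved algorithm. It assumes a hypothetical one-move "game" with three actions and fixed outcomes, a softmax policy, and exact expectations in place of sampled playouts so that the run is deterministic. All names (`softmax`, `expected_outcome`, `balance`) are made up for this illustration.

```python
import math


def softmax(theta):
    """Turn action preferences into a probability distribution."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def expected_outcome(theta, outcomes):
    """Expected result V(theta) of the softmax policy in the toy game."""
    return sum(p * z for p, z in zip(softmax(theta), outcomes))


def balance(theta, outcomes, target, step_size=1.0, iterations=500):
    """Gradient descent on 0.5 * (target - V(theta))**2.

    Assumption: expectations are computed exactly; a real playout
    policy would estimate them from sampled simulations instead.
    """
    theta = list(theta)
    for _ in range(iterations):
        pi = softmax(theta)
        value = sum(p * z for p, z in zip(pi, outcomes))
        error = target - value
        # For a softmax policy: dV/dtheta_j = pi_j * (z_j - V).
        grad = [p * (z - value) for p, z in zip(pi, outcomes)]
        theta = [t + step_size * error * g for t, g in zip(theta, grad)]
    return theta


outcomes = [1.0, 0.0, 0.0]   # hypothetical result of each action (1 = win)
target = 0.7                 # hypothetical target value for this position
theta0 = [0.0, 0.0, 0.0]     # uniform initial policy
theta = balance(theta0, outcomes, target)
```

Comparing `expected_outcome(theta0, outcomes)` with `expected_outcome(theta, outcomes)` shows the policy's expected result moving from the uniform-policy value toward the target. The objective constrains only the expected outcome, not agreement with any expert's move choice, which is the contrast with apprenticeship learning drawn in the abstract.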
Pages: 7