LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios

Cited by: 0
Authors
Niu, Yazhe [1, 3]
Pu, Yuan [2 ]
Yang, Zhenjie [1 ]
Li, Xueyan [2 ]
Zhou, Tong [1 ]
Ren, Jiyuan [2 ]
Hu, Shuai [1 ]
Li, Hongsheng [3, 4]
Liu, Yu [1, 2]
Affiliations
[1] SenseTime Grp LTD, Hong Kong, Peoples R China
[2] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[4] Ctr Perceptual & Interact Intelligence, Hong Kong, Peoples R China
Keywords
LEVEL; GAME
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Building agents based on tree-search planning capabilities with learned models has achieved remarkable success in classic decision-making problems, such as Go and Atari. However, it has been deemed challenging or even infeasible to extend Monte Carlo Tree Search (MCTS) based algorithms to diverse real-world applications, especially when these environments involve complex action spaces and significant simulation costs, or inherent stochasticity. In this work, we introduce LightZero, the first unified benchmark for deploying MCTS/MuZero in general sequential decision scenarios. Specifically, we summarize the most critical challenges in designing a general MCTS-style decision-making solver, then decompose the tightly-coupled algorithm and system design of tree-search RL methods into distinct sub-modules. By incorporating more appropriate exploration and optimization strategies, we can significantly enhance these sub-modules and construct powerful LightZero agents to tackle tasks across a wide range of domains, such as board games, Atari, MuJoCo, MiniGrid and GoBigger. Detailed benchmark results reveal the significant potential of such methods in building scalable and efficient decision intelligence. The code is available as part of OpenDILab at https://github.com/opendilab/LightZero.
Pages: 42
Related papers
50 items in total
  • [1] Monte Carlo Tree Search for Feature Model Analyses: a General Framework for Decision-Making
    Horcas, Jose-Miguel
    Galindo, Jose A.
    Heradio, Ruben
    Fernandez-Amoros, David
    Benavides, David
    SPLC '21: PROCEEDINGS OF THE 25TH ACM INTERNATIONAL SYSTEMS AND SOFTWARE PRODUCT LINE CONFERENCE, VOL A, 2021,
  • [2] A Monte Carlo tree search approach to learning decision trees
    Nunes, Cecilia
    De Craene, Mathieu
    Langet, Helene
    Camara, Oscar
    Jonsson, Anders
    2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 429 - 435
  • [3] A Decision Heuristic for Monte Carlo Tree Search Doppelkopf Agents
    Dockhorn, Alexander
    Doell, Christoph
    Hewelt, Matthias
    Kruse, Rudolf
    2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, : 51 - 58
  • [4] Sequential Monte Carlo: A Unified Review
    Wills, Adrian G.
    Schon, Thomas B.
    ANNUAL REVIEW OF CONTROL ROBOTICS AND AUTONOMOUS SYSTEMS, 2023, 6 : 159 - 182
  • [5] Scalable Monte Carlo Tree Search for CAVs Action Planning in Colliding Scenarios
    Patel, Dhruvkumar
    Zalila-Wenkstern, Rym
    2021 32ND IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2021, : 1065 - 1072
  • [6] A Unified Perspective on Value Backup and Exploration in Monte-Carlo Tree Search
    Dam, Tuan
    D’Eramo, Carlo
    Peters, Jan
    Pajarinen, Joni
    Journal of Artificial Intelligence Research, 2024, 81 : 511 - 577
  • [7] A Unified Perspective on Value Backup and Exploration in Monte-Carlo Tree Search
    Dam, Tuan
    D'Eramo, Carlo
    Peters, Jan
    Pajarinen, Joni
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2024, 81
  • [8] Monte Carlo Tree Search with Options for General Video Game Playing
    de Waard, Maarten
    Roijers, Diederik M.
    Bakkes, Sander C. J.
    2016 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND GAMES (CIG), 2016,
  • [9] Multiagent Monte Carlo Tree Search
    Zerbel, Nicholas
    Yliniemi, Logan
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2309 - 2311
  • [10] Monte Carlo Tree Search with Metaheuristics
    Mandziuk, Jacek
    Walczak, Patryk
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2023, PT II, 2023, 14126 : 134 - 144