Bridge Bidding via Deep Reinforcement Learning and Belief Monte Carlo Search

Cited by: 1

Authors:

Qiu, Zizhang [1]

Wang, Shouguang [1]

You, Dan [1]

Zhou, MengChu [1]

Affiliations:

[1] Zhejiang Gongshang Univ, Sch Informat & Elect Engn, Hangzhou 310018, Peoples R China

Source:

IEEE-CAA JOURNAL OF AUTOMATICA SINICA | 2024 / Vol. 11 / Issue 10

Keywords:

Bridges; Monte Carlo methods; Supervised learning; Interference; Games; Deep reinforcement learning; Software; Contract Bridge; reinforcement learning; search; GO; ALGORITHM; GAME;

DOI:

10.1109/JAS.2024.124488

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Contract Bridge, a four-player imperfect information game, comprises two phases: bidding and playing. While computer programs excel at playing, bidding remains challenging because it requires exchanging information with one's partner while interfering with the opponents' communication. In this work, we introduce a Bridge bidding agent that combines supervised learning, deep reinforcement learning via self-play, and a test-time search approach. Our experiments demonstrate that our agent outperforms WBridge5, a highly regarded computer Bridge program that has won multiple world championships, by 0.98 IMPs (international match points) per deal over 10 000 deals, with a much more cost-effective approach. This performance significantly surpasses the previous state of the art (0.85 IMPs per deal). Note that 0.1 IMPs per deal is a significant improvement in Bridge bidding.


Pages: 2111-2122

Page count: 12
