R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games

Cited by: 0

Authors:

Dai, Zhongxiang [1]

Chen, Yizhou [1]

Low, Bryan Kian Hsiang [1]

Jaillet, Patrick [2]

Ho, Teck-Hua [3]

Affiliations:

[1] Natl Univ Singapore, Dept Comp Sci, Singapore, Singapore

[2] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA

[3] Natl Univ Singapore, NUS Business Sch, Singapore, Singapore

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119 | 2020 / Vol. 119

Funding:

National Research Foundation, Singapore;

Keywords:

FRAMEWORK;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a recursive reasoning formalism of Bayesian optimization (BO) to model the reasoning process in the interactions between boundedly rational, self-interested agents with unknown, complex, and costly-to-evaluate payoff functions in repeated games, which we call Recursive Reasoning-Based BO (R2-B2). Our R2-B2 algorithm is general in that it does not constrain the relationship among the payoff functions of different agents and can thus be applied to various types of games such as constant-sum, general-sum, and common-payoff games. We prove that by reasoning at level 2 or more and at one level higher than the other agents, our R2-B2 agent can achieve faster asymptotic convergence to no regret than that without utilizing recursive reasoning. We also propose a computationally cheaper variant of R2-B2 called R2-B2-Lite at the expense of a weaker convergence guarantee. The performance and generality of our R2-B2 algorithm are empirically demonstrated using synthetic games, adversarial machine learning, and multi-agent reinforcement learning.

Pages: 11
