No-Regret Learning in Time-Varying Zero-Sum Games

Cited by: 0
Authors
Zhang, Mengxiao [1]
Zhao, Peng [2]
Luo, Haipeng [1]
Zhou, Zhi-Hua [2]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90089 USA
[2] Nanjing Univ, Nat Key Lab Novel Software Technol, Nanjing, Peoples R China
Funding
National Science Foundation (USA);
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning from repeated play in a fixed two-player zero-sum game is a classic problem in game theory and online learning. We consider a variant of this problem where the game payoff matrix changes over time, possibly in an adversarial manner. We first present three performance measures to guide the algorithmic design for this problem: 1) the well-studied individual regret, 2) an extension of the duality gap, and 3) a new measure called dynamic Nash Equilibrium regret, which quantifies the cumulative difference between the player's payoff and the minimax game value. Next, we develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under all three of these performance measures. These guarantees are adaptive to different nonstationarity measures of the payoff matrices and, importantly, recover the best known results when the payoff matrix is fixed. Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property, along with several novel ingredients specifically designed for the time-varying game setting. Empirical results further validate the effectiveness of our algorithm.
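The three performance measures named in the abstract can be illustrated on a toy sequence of 2x2 games. The following is a minimal sketch under stated assumptions, not the paper's algorithm or its exact definitions: it assumes the row player picks x to minimize x^T A_t y while the column player picks y to maximize it, it uses a closed-form game value that is only valid for 2x2 matrices, and every function name is hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]
Strategy = Sequence[float]


def payoff(x: Strategy, A: Matrix, y: Strategy) -> float:
    """Bilinear payoff x^T A y."""
    return sum(x[i] * A[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def value_2x2(A: Matrix) -> float:
    """Minimax value min_x max_y x^T A y of a 2x2 game (row player minimizes)."""
    (a, b), (c, d) = A
    upper = min(max(a, b), max(c, d))  # best pure guarantee for the minimizing row player
    lower = max(min(a, c), min(b, d))  # best pure guarantee for the maximizing column player
    if upper == lower:                 # a pure saddle point exists
        return upper
    return (a * d - b * c) / (a + d - b - c)  # mixed equilibrium value


def individual_regret_x(xs, ys, As) -> float:
    """Row player's regret against the best fixed strategy in hindsight."""
    incurred = sum(payoff(x, A, y) for x, A, y in zip(xs, As, ys))
    # A linear function over the simplex is minimized at a vertex,
    # so comparing against pure rows is enough.
    best_fixed = min(
        sum(sum(A[i][j] * y[j] for j in range(len(y))) for A, y in zip(As, ys))
        for i in range(len(As[0]))
    )
    return incurred - best_fixed


def duality_gap(xs, ys, As) -> float:
    """Sum over rounds of max_y x_t^T A_t y - min_x x^T A_t y_t."""
    total = 0.0
    for x, A, y in zip(xs, As, ys):
        m, n = len(A), len(A[0])
        best_y = max(sum(x[i] * A[i][j] for i in range(m)) for j in range(n))
        best_x = min(sum(A[i][j] * y[j] for j in range(n)) for i in range(m))
        total += best_y - best_x
    return total


def dynamic_ne_regret(xs, ys, As) -> float:
    """Absolute cumulative difference between realized payoff and per-round game value."""
    return abs(sum(payoff(x, A, y) - value_2x2(A) for x, A, y in zip(xs, As, ys)))


# Toy sequence: matching pennies, with the sign flipped in the second round.
As = [[[1, -1], [-1, 1]], [[-1, 1], [1, -1]]]
xs = [[1.0, 0.0], [1.0, 0.0]]  # row player always plays row 0
ys = [[1.0, 0.0], [1.0, 0.0]]  # column player always plays column 0

print(individual_regret_x(xs, ys, As), duality_gap(xs, ys, As), dynamic_ne_regret(xs, ys, As))
```

On this toy sequence the individual regret and the dynamic NE-regret are both 0 while the duality gap is 4, which suggests why a guarantee under one measure need not imply a guarantee under the others.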
Pages: 37
Related papers
50 in total
  • [1] No-Regret Distributed Learning in Subnetwork Zero-Sum Games
    Huang, Shijie
    Lei, Jinlong
    Hong, Yiguang
    Shanbhag, Uday V.
    Chen, Jie
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (10) : 6620 - 6635
  • [2] No-Regret Distributed Learning in Two-Network Zero-Sum Games
    Huang, Shijie
    Lei, Jinlong
    Hong, Yiguang
    Shanbhag, Uday V.
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 924 - 929
  • [3] On the Convergence of No-Regret Learning Dynamics in Time-Varying Games
    Anagnostides, Ioannis
    Panageas, Ioannis
    Farina, Gabriele
    Sandholm, Tuomas
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [4] Near-optimal no-regret algorithms for zero-sum games
    Daskalakis, Constantinos
    Deckelbaum, Alan
    Kim, Anthony
    GAMES AND ECONOMIC BEHAVIOR, 2015, 92 : 327 - 348
  • [5] Near-Optimal No-Regret Algorithms for Zero-Sum Games
    Daskalakis, Constantinos
    Deckelbaum, Alan
    Kim, Anthony
    PROCEEDINGS OF THE TWENTY-SECOND ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2011, : 235 - 254
  • [6] Let's be Honest: An Optimal No-Regret Framework for Zero-Sum Games
    Kangarshahi, Ehsan Asadi
    Hsieh, Ya-Ping
    Sahin, Mehmet Fatih
    Cevher, Volkan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [7] Logarithmic-Regret Quantum Learning Algorithms for Zero-Sum Games
    Gao, Minbo
    Ji, Zhengfeng
    Li, Tongyang
    Wang, Qisheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [8] No-Regret Algorithms for Time-Varying Bayesian Optimization
    Zhou, Xingyu
    Shroff, Ness
    2021 55TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2021,
  • [9] Regret Minimization in Behaviorally-Constrained Zero-Sum Games
    Farina, Gabriele
    Kroer, Christian
    Sandholm, Tuomas
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [10] No-Regret Learning in Bayesian Games
    Hartline, Jason
    Syrgkanis, Vasilis
    Tardos, Eva
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28