Multi-Agent Reinforcement Learning in Non-Cooperative Stochastic Games Using Large Language Models

Cited by: 0
Authors
Alsadat, Shayan Meshkat [1]
Xu, Zhe [1]
Affiliations
[1] Arizona State Univ, Fac Mech Engn, Tempe, AZ 85281 USA
Source
Keywords
Games; Nash equilibrium; Stochastic processes; Q-learning; Convergence; Learning automata; Large language models; Trajectory; Robustness; Probabilistic logic; Reinforcement learning; stochastic games; reward machines
DOI
10.1109/LCSYS.2024.3515879
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
We study the use of large language models (LLMs) to integrate high-level knowledge into stochastic games using reinforcement learning with reward machines, which encode non-Markovian and Markovian reward functions. In non-cooperative games, one challenge is to provide agents with knowledge about the task efficiently, so as to speed up convergence to an optimal policy. We aim to provide this knowledge in the form of deterministic finite automata (DFA) generated by LLMs (LLM-generated DFA). Additionally, we use reward machines (RMs) to encode the temporal structure of the game and the non-Markovian or Markovian reward functions. Our proposed algorithm, LLM-generated DFA for Multi-agent Reinforcement Learning with Reward Machines for Stochastic Games (StochQ-RM), can use the LLM-generated DFA to learn a reward machine equivalent to the ground-truth reward machine (the specified task) in the environment. Additionally, we propose DFA-based Q-learning with reward machines (DBQRM) to find the best responses for each agent using Nash equilibrium in stochastic games. Although LLMs are known to hallucinate, we show that our method is robust and guaranteed to converge to an optimal policy. Furthermore, we study the performance of our proposed method in three case studies.
Pages: 2757-2762
Number of pages: 6
Related papers
  • [1] Generation of Coupling Topologies for Multi-Agent Systems using Non-Cooperative Games
    Kloock, Maximilian
    Dirksen, Matthis
    Kowalewski, Stefan
    Alrifaee, Bassam
    2022 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2022: 1-8
  • [2] Multi-agent Deep Reinforcement Learning for Non-Cooperative Power Control in Heterogeneous Networks
    Zhang, Lin
    Liang, Ying-Chang
    2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2020
  • [3] Multi-Agent Evolutionary Reinforcement Learning Based on Cooperative Games
    Yu, Jin
    Zhang, Ya
    Sun, Changyin
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024
  • [4] Non-cooperative multi-agent deep reinforcement learning for channel resource allocation in vehicular networks
    Zhang, Fuxin
    Yao, Sihan
    Liu, Wei
    Qi, Liang
    COMPUTER NETWORKS, 2025, 257
  • [5] Cooperative Multi-Agent Reinforcement Learning in a Large Stationary Environment
    Zemzem, Wiem
    Tagina, Moncef
    2017 16TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS 2017), 2017: 365-371
  • [6] Pacesetter Learning for Large Scale Cooperative Multi-Agent Reinforcement Learning
    Zhou, Pingqi
    Li, Chao
    Qiu, Mengwei
    Liu, Jun
    Ma, Chennan
    Yan, Ming
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VI, 2023, 14259: 115-126
  • [7] Multi-agent Reinforcement Learning in stochastic single and multi-stage games
    Verbeeck, K
    Nowé, A
    Peeters, M
    Tuyls, K
    ADAPTIVE AGENTS AND MULTI-AGENT SYSTEMS II: ADAPTATION AND MULTI-AGENT LEARNING, 2005, 3394: 275-294
  • [8] Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning
    Chen, Hao
    Yang, Guangkai
    Zhang, Junge
    Yin, Qiyue
    Huang, Kaiqi
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [9] MFVFD: A Multi-Agent Q-Learning Approach to Cooperative and Non-Cooperative Tasks
    Zhang, Tianhao
    Ye, Qiwei
    Bian, Jiang
    Xie, Guangming
    Liu, Tie-Yan
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021: 500-506
  • [10] Multi-Agent Reinforcement Learning in Cournot Games
    Shi, Yuanyuan
    Zhang, Baosen
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020: 3561-3566