Multi-Agent Reinforcement Learning in Non-Cooperative Stochastic Games Using Large Language Models

Cited by: 0

Authors:

Alsadat, Shayan Meshkat [1]

Xu, Zhe [1]

Affiliations:

[1] Arizona State Univ, Fac Mech Engn, Tempe, AZ 85281 USA

Source:

IEEE CONTROL SYSTEMS LETTERS | 2024 / Vol. 8

Keywords:

Games; Nash equilibrium; Stochastic processes; Q-learning; Convergence; Learning automata; Large language models; Trajectory; Robustness; Probabilistic logic; Reinforcement learning; large language models; stochastic games; reward machines;

DOI:

10.1109/LCSYS.2024.3515879

Chinese Library Classification (CLC):

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

We study the use of large language models (LLMs) to integrate high-level knowledge in stochastic games using reinforcement learning with reward machines to encode non-Markovian and Markovian reward functions. In non-cooperative games, one challenge is to provide agents with knowledge about the task efficiently to speed up convergence to an optimal policy. We aim to provide this knowledge in the form of deterministic finite automata (DFA) generated by LLMs (LLM-generated DFA). Additionally, we use reward machines (RMs) to encode the temporal structure of the game and the non-Markovian or Markovian reward functions. Our proposed algorithm, LLM-generated DFA for Multi-agent Reinforcement Learning with Reward Machines for Stochastic Games (StochQ-RM), can use the LLM-generated DFA to learn a reward machine equivalent to the ground-truth reward machine (the specified task) in the environment. Additionally, we propose DFA-based Q-learning with reward machines (DBQRM) to find the best response for each agent using Nash equilibrium in stochastic games. Although LLMs are known to hallucinate, we show that our method is robust and guaranteed to converge to an optimal policy. Furthermore, we study the performance of our proposed method in three case studies.
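As a rough illustration of what the abstract means by a reward machine encoding a non-Markovian reward, the following minimal Python sketch models a reward machine as a finite automaton whose transitions carry rewards. It is not the authors' StochQ-RM or DBQRM algorithm. The class name `RewardMachine`, the states `u0`, `u1`, `u2`, and the "key then door" task are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RewardMachine:
    """Hypothetical minimal reward machine: a DFA whose transitions emit rewards."""
    initial: str
    transitions: dict  # maps (state, label) -> (next_state, reward)
    terminal: frozenset = frozenset()

    def step(self, state, label):
        # A label with no listed transition leaves the state unchanged, reward 0.
        return self.transitions.get((state, label), (state, 0.0))

    def run(self, labels):
        # Feed a sequence of observed labels and accumulate the emitted reward.
        state, total = self.initial, 0.0
        for label in labels:
            if state in self.terminal:
                break
            state, reward = self.step(state, label)
            total += reward
        return state, total


# Assumed example task: the agent is rewarded for "door" only after seeing "key".
rm = RewardMachine(
    initial="u0",
    transitions={
        ("u0", "key"): ("u1", 0.0),
        ("u1", "door"): ("u2", 1.0),
    },
    terminal=frozenset({"u2"}),
)

state, total = rm.run(["door", "key", "door"])
```

In this sketch the label `door` yields no reward the first time and a reward of 1.0 after `key` has been observed, so the reward depends on history and not only on the current observation. Tracking the machine state alongside the environment state is what makes such a reward tractable for a Q-learning agent.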

Pages: 2757-2762

Page count: 6

50 records in total

[1] Generation of Coupling Topologies for Multi-Agent Systems using Non-Cooperative Games
Kloock, Maximilian
Dirksen, Matthis
Kowalewski, Stefan
Alrifaee, Bassam
2022 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2022, : 1 - 8
[2] Multi-agent Deep Reinforcement Learning for Non-Cooperative Power Control in Heterogeneous Networks
Zhang, Lin
Liang, Ying-Chang
2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2020,
[3] Multi-Agent Evolutionary Reinforcement Learning Based on Cooperative Games
Yu, Jin
Zhang, Ya
Sun, Changyin
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024,
[4] Non-cooperative multi-agent deep reinforcement learning for channel resource allocation in vehicular networks
Zhang, Fuxin
Yao, Sihan
Liu, Wei
Qi, Liang
COMPUTER NETWORKS, 2025, 257
[5] Cooperative Multi-Agent Reinforcement Learning in a Large Stationary Environment
Zemzem, Wiem
Tagina, Moncef
2017 16TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS 2017), 2017, : 365 - 371
[6] Pacesetter Learning for Large Scale Cooperative Multi-Agent Reinforcement Learning
Zhou, Pingqi
Li, Chao
Qiu, Mengwei
Liu, Jun
Ma, Chennan
Yan, Ming
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VI, 2023, 14259 : 115 - 126
[7] Multi-agent Reinforcement Learning in stochastic single and multi-stage games
Verbeeck, K
Nowé, A
Peeters, M
Tuyls, K
ADAPTIVE AGENTS AND MULTI-AGENT SYSTEMS II: ADAPTATION AND MULTI-AGENT LEARNING, 2005, 3394 : 275 - 294
[8] Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning
Chen, Hao
Yang, Guangkai
Zhang, Junge
Yin, Qiyue
Huang, Kaiqi
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
[9] MFVFD : A Multi-Agent Q-Learning Approach to Cooperative and Non-Cooperative Tasks
Zhang, Tianhao
Ye, Qiwei
Bian, Jiang
Xie, Guangming
Liu, Tie-Yan
PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 500 - 506
[10] Multi-Agent Reinforcement Learning in Cournot Games
Shi, Yuanyuan
Zhang, Baosen
2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 3561 - 3566