Cooperative Multi-Agent Q-Learning Using Distributed MPC

Cited by: 2
Authors
Esfahani, Hossein Nejatbakhsh [1 ]
Velni, Javad Mohammadpour [1 ]
Affiliations
[1] Clemson Univ, Dept Mech Engn, Clemson, SC 29634 USA
Source
IEEE Control Systems Letters
Funding
U.S. National Science Foundation
Keywords
Q-learning; Approximation algorithms; Couplings; Costs; Predictive control; Multi-agent systems; Linear programming; Multi-agent Q-Learning; distributed MPC; cooperative control;
DOI
10.1109/LCSYS.2024.3407632
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
In this letter, we propose a cooperative Multi-Agent Reinforcement Learning (MARL) approach based on Distributed Model Predictive Control (DMPC). In the proposed framework, the local MPC schemes are formulated using the dual decomposition method in the context of DMPC and are used to derive the local state (and action) value functions required by a cooperative Q-learning algorithm. We further show that the DMPC scheme yields a framework based on the Value Function Decomposition (VFD) principle, so that the global state (and action) value functions can be decomposed into several local state (and action) value functions obtained from the local MPCs. In the proposed cooperative MARL, coordination between individual agents is then achieved through the multiplier-sharing step, also known as inter-agent negotiation, in the DMPC scheme.
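As a toy illustration only (not the paper's algorithm), the multiplier-sharing coordination described in the abstract can be sketched as a classic dual-decomposition loop: each agent solves a purely local problem given a shared Lagrange multiplier, and the multiplier is then updated from the coupling residual. The quadratic agent costs, the two-agent coupling constraint, and the step size below are illustrative assumptions.

```python
# Toy dual-decomposition coordination of two agents via a shared multiplier.
# Assumed (illustrative) problem:
#     min 0.5*a1*x1^2 + 0.5*a2*x2^2   s.t.  x1 + x2 = c
# The Lagrangian separates across agents, so each agent minimizes its own
# term given the shared multiplier lam; a coordinator then updates lam from
# the coupling residual (the "multiplier-sharing" / negotiation step).

def local_solve(a, lam):
    """Agent's closed-form local minimizer of 0.5*a*x^2 + lam*x."""
    return -lam / a

def dual_decomposition(a=(1.0, 2.0), c=3.0, step=0.3, iters=200):
    lam = 0.0
    for _ in range(iters):
        x = [local_solve(ai, lam) for ai in a]   # parallel local solves
        lam += step * (sum(x) - c)               # shared-multiplier update
    # Global cost decomposes into local terms (VFD-style additivity):
    local_values = [0.5 * ai * xi**2 for ai, xi in zip(a, x)]
    return x, lam, sum(local_values)
```

For the assumed costs the iteration converges to x = (2, 1) with multiplier -2, satisfying the coupling constraint x1 + x2 = 3; the returned global value is just the sum of the agents' local values, mirroring the decomposition idea.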
Pages: 2193-2198
Page count: 6
Related Papers
(50 items total)
  • [41] Galstyan, Aram. Continuous strategy replicator dynamics for multi-agent Q-learning. Autonomous Agents and Multi-Agent Systems, 2013, 26(1): 37-53.
  • [42] Leng, Lixiong; Li, Jingchen; Zhu, Jinhui; Hwang, Kao-Shing; Shi, Haobin. Multi-Agent Reward-Iteration Fuzzy Q-Learning. International Journal of Fuzzy Systems, 2021, 23(6): 1669-1679.
  • [43] Sylvestre, Mathieu; Pavel, Lacra. Q-Learning with Side Information in Multi-Agent Finite Games. 2019 IEEE 58th Conference on Decision and Control (CDC), 2019: 5032-5037.
  • [44] Nath, Amar; Niyogi, Rajdeep; Singh, Tajinder; Kumar, Virendra. Multi-agent Q-learning Based Navigation in an Unknown Environment. Advanced Information Networking and Applications (AINA-2022), Vol. 1, 2022, 449: 330-340.
  • [45] Hwang, Kao-Shing; Chen, Yu-Jen; Jiang, Wei-Cheng; Lin, Tzung-Feng. Continuous Action Generation of Q-Learning in Multi-Agent Cooperation. Asian Journal of Control, 2013, 15(4): 1011-1020.
  • [46] Tesauro, G. Extending Q-Learning to General Adaptive Multi-Agent Systems. Advances in Neural Information Processing Systems 16, 2004, 16: 871-878.
  • [47] Wongphatcharatham, Tanutsorn; Phakphisut, Watid; Wijitpornchai, Thongchai; Areeprayoonkij, Poonlarp; Jaruvitayakovit, Tanun; Hannanta-Anan, Pimkhuan. Multi-Agent Q-Learning for Power Allocation in Interference Channel. 2022 37th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC 2022), 2022: 876-879.
  • [49] Huang, Anqi; Wang, Yongli; Sang, Jianghui; Wang, Xiaoli; Wang, Yupeng. DVF: Multi-agent Q-learning with difference value factorization. Knowledge-Based Systems, 2024, 286.
  • [50] Peng, Jun; Liu, Miao; Wu, Min; Zhang, Xiaoyong; Lin, Kuo-Chi. Multi-Agent Coordination Method Based on Fuzzy Q-Learning. 2008 7th World Congress on Intelligent Control and Automation, 2008: 5411+.