DATA-DRIVEN ROBUST MULTI-AGENT REINFORCEMENT LEARNING

Cited by: 0
Authors
Wang, Yudan [1 ]
Wang, Yue [1 ]
Zhou, Yi [2 ]
Velasquez, Alvaro [3 ]
Zou, Shaofeng [1 ]
Affiliations
[1] SUNY Buffalo, Buffalo, NY 14260 USA
[2] Univ Utah, Dept Elect & Comp Engn, Salt Lake City, UT 84112 USA
[3] Air Force Res Lab, Informat Directorate, Wright Patterson AFB, OH USA
Funding
US National Science Foundation
Keywords
Distributionally robust; model-free; sample complexity; finite-time analysis; robust MDP;
DOI
10.1109/MLSP55214.2022.9943500
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Multi-agent reinforcement learning (MARL) in the collaborative setting aims to find a joint policy that maximizes the accumulated reward averaged over all agents. In this paper, we focus on MARL under model uncertainty, where the transition kernel is assumed to lie in an uncertainty set, and the goal is to optimize the worst-case performance over that set. We investigate the model-free setting, where the uncertainty set is centered around an unknown Markov decision process from which a single sample trajectory can be obtained sequentially. We develop a robust multi-agent Q-learning algorithm, which is model-free and fully decentralized. We theoretically prove that the proposed algorithm converges to the minimax robust policy, and further characterize its sample complexity. Compared to vanilla multi-agent Q-learning, our algorithm offers provable robustness under model uncertainty without incurring additional computational or memory cost.
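The paper's own algorithm is not reproduced in this record; as a hypothetical illustration of the kind of update a distributionally robust Q-learning method performs, the sketch below assumes an R-contamination uncertainty set (a common choice in the robust RL literature, not confirmed by the abstract). The function name, the toy MDP, and all parameter values are illustrative assumptions. The bootstrapped target mixes the observed next-state value with the worst-case state value:

```python
import numpy as np

def robust_q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9, R=0.2):
    """One robust Q-learning step under an R-contamination set: nature
    follows the nominal (unknown) kernel with probability 1 - R and may
    jump to the worst state with probability R, so the target mixes the
    observed next-state value with the worst-case stored value."""
    V = Q.max(axis=1)                               # V(s) = max_a Q(s, a)
    robust_v = (1.0 - R) * V[s_next] + R * V.min()  # worst case over the set
    Q[s, a] += alpha * (r + gamma * robust_v - Q[s, a])
    return Q

# Toy check on a 3-state, 2-action table initialized to zero:
# target = 1.0 + 0.9 * 0 = 1.0, so Q[0, 1] moves to 0.5.
Q = np.zeros((3, 2))
robust_q_update(Q, s=0, a=1, r=1.0, s_next=2)
```

Note that setting R = 0 recovers the vanilla Q-learning target, and the robust step adds only a min over already-stored values, which is one way to read the abstract's claim that robustness comes without additional computational or memory cost.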
Pages: 6