DATA-DRIVEN ROBUST MULTI-AGENT REINFORCEMENT LEARNING

Cited by: 0
Authors
Wang, Yudan [1]
Wang, Yue [1]
Zhou, Yi [2]
Velasquez, Alvaro [3]
Zou, Shaofeng [1]
Affiliations
[1] SUNY Buffalo, Buffalo, NY 14260 USA
[2] Univ Utah, Dept Elect & Comp Engn, Salt Lake City, UT 84112 USA
[3] Air Force Res Lab, Informat Directorate, Wright Patterson AFB, OH USA
Funding
U.S. National Science Foundation
Keywords
Distributionally robust; model-free; sample complexity; finite-time analysis; robust MDP
DOI
10.1109/MLSP55214.2022.9943500
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-agent reinforcement learning (MARL) in the collaborative setting aims to find a joint policy that maximizes the accumulated reward averaged over all the agents. In this paper, we focus on MARL under model uncertainty, where the transition kernel is assumed to lie in an uncertainty set, and the goal is to optimize the worst-case performance over that set. We investigate the model-free setting, where the uncertainty set is centered on an unknown Markov decision process from which a single sample trajectory can be obtained sequentially. We develop a robust multi-agent Q-learning algorithm, which is model-free and fully decentralized. We theoretically prove that the proposed algorithm converges to the minimax robust policy, and further characterize its sample complexity. Compared with vanilla multi-agent Q-learning, our algorithm offers provable robustness under model uncertainty without incurring additional computational or memory cost.
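The abstract does not specify the form of the uncertainty set or how the agents communicate, so the following is only a minimal sketch of what a worst-case Q-learning step could look like, not the authors' algorithm. It assumes an R-contamination uncertainty set (with probability `R` the next state may be chosen adversarially), tabular Q-functions, and a doubly stochastic mixing matrix `W` for neighbor averaging. The names `robust_q_update`, `consensus_step`, `R`, and `W` are all hypothetical.

```python
import numpy as np


def robust_q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9, R=0.2):
    """One tabular robust Q-learning step (assumed R-contamination set).

    Q is a (num_states, num_actions) array and is modified in place.
    With R = 0 this reduces to the vanilla Q-learning update.
    """
    V = Q.max(axis=1)        # greedy value of every state
    worst = V.min()          # value of the adversary's preferred next state
    # Worst-case target: mix the sampled next state with the worst state.
    target = r + gamma * ((1.0 - R) * V[s_next] + R * worst)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q


def consensus_step(Q_agents, W):
    """Average Q-tables across agents (assumed communication scheme).

    Q_agents has shape (num_agents, num_states, num_actions);
    W is a (num_agents, num_agents) mixing matrix whose rows sum to 1.
    """
    return np.einsum("ij,jsa->isa", W, Q_agents)
```

Under these assumptions the robust step reuses the same Q-table as the vanilla update and adds only a minimum over state values, which is consistent in spirit with the abstract's claim of no additional memory cost.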
Pages: 6