Efficient Communication via Self-Supervised Information Aggregation for Online and Offline Multiagent Reinforcement Learning

Cited by: 0
Authors
Guan, Cong [1 ,2 ]
Chen, Feng [1 ,2 ]
Yuan, Lei [3 ]
Zhang, Zongzhang [1 ,2 ]
Yu, Yang [3 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[2] Nanjing Univ, Sch Artificial Intelligence, Nanjing 210023, Peoples R China
[3] Polixir Technol, Nanjing 211106, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Benchmark testing; Reinforcement learning; Observability; Training; Learning (artificial intelligence); Decision making; Data mining; Cooperative multiagent reinforcement learning (MARL); multiagent communication; offline learning; representation learning;
DOI
10.1109/TNNLS.2024.3420791
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Utilizing messages from teammates can improve coordination in cooperative multiagent reinforcement learning (MARL). Previous works typically combine teammates' raw messages with local information as inputs to the policy. However, neglecting message aggregation causes significant inefficiency in policy learning. Motivated by recent advances in representation learning, we argue that efficient message aggregation is essential for good coordination in cooperative MARL. In this article, we propose Multiagent communication via Self-supervised Information Aggregation (MASIA), in which agents aggregate received messages into compact, highly relevant representations that augment the local policy. Specifically, we design a permutation-invariant message encoder to generate a common information-aggregated representation from the messages and optimize it in a self-supervised manner by reconstructing and shooting future information. Each agent then utilizes the most relevant parts of the aggregated representation for decision-making through a novel message extraction mechanism. Furthermore, considering the potential of offline learning for real-world applications, we build offline benchmarks for multiagent communication, which are, to the best of our knowledge, the first of their kind. Empirical results demonstrate the superiority of our method in both online and offline settings. We also release the offline benchmarks built in this article as a testbed for validating communication ability, to facilitate future research in this direction.
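To illustrate the aggregation idea described in the abstract, the following is a minimal sketch of a permutation-invariant message encoder trained with a self-supervised reconstruction objective. It is written in PyTorch under stated assumptions: the class name, dimensions, mean pooling, and MSE reconstruction target are hypothetical illustrations and cover only the reconstruction part, not the future-shooting objective or the message extraction mechanism of the authors' actual MASIA implementation.

```python
# Hypothetical sketch (not the authors' code): permutation-invariant
# aggregation of teammates' messages with a self-supervised reconstruction loss.
import torch
import torch.nn as nn

class MessageAggregator(nn.Module):
    def __init__(self, msg_dim: int, hidden_dim: int, n_agents: int):
        super().__init__()
        # Shared per-message encoder; mean pooling over the message axis makes
        # the aggregated representation invariant to teammate ordering.
        self.encoder = nn.Sequential(
            nn.Linear(msg_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # Decoder reconstructs the concatenated raw messages from the
        # aggregated code (self-supervised signal, no reward required).
        self.decoder = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_agents * msg_dim),
        )

    def forward(self, messages: torch.Tensor) -> torch.Tensor:
        # messages: (batch, n_agents, msg_dim) -> aggregated code (batch, hidden_dim)
        return self.encoder(messages).mean(dim=1)

    def reconstruction_loss(self, messages: torch.Tensor) -> torch.Tensor:
        z = self.forward(messages)
        target = messages.flatten(start_dim=1)
        return nn.functional.mse_loss(self.decoder(z), target)

# Usage: compute a compact representation for the policy and add the
# auxiliary reconstruction loss to the RL training objective.
agg = MessageAggregator(msg_dim=8, hidden_dim=32, n_agents=4)
msgs = torch.randn(16, 4, 8)            # batch of received messages
z = agg(msgs)                           # input to augment the local policy
aux_loss = agg.reconstruction_loss(msgs)
```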
Pages: 13
Related Papers
50 in total (first 10 shown)
  • [1] Efficient Multi-agent Communication via Self-supervised Information Aggregation. Guan, Cong; Chen, Feng; Yuan, Lei; Wang, Chenghe; Yin, Hao; Zhang, Zongzhang; Yu, Yang. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [2] Self-Supervised Imitation for Offline Reinforcement Learning With Hindsight Relabeling. Yu, Xudong; Bai, Chenjia; Wang, Changhong; Yu, Dengxiu; Chen, C. L. Philip; Wang, Zhen. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023, 53(12): 7732-7743.
  • [3] Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning. Qiu, Shuang; Wang, Lingxiao; Bai, Chenjia; Yang, Zhuoran; Wang, Zhaoran. International Conference on Machine Learning, Vol. 162, 2022.
  • [4] Efficient Self-Supervised Data Collection for Offline Robot Learning. Endrawis, Shadi; Leibovich, Gal; Jacob, Guy; Novik, Gal; Tamar, Aviv. 2021 IEEE International Conference on Robotics and Automation (ICRA 2021), 2021: 4650-4656.
  • [5] Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning. Gao, Z.; Xu, K.; Zhai, Y.; Ding, B.; Feng, D.; Mao, X.; Wang, H. IEEE Transactions on Artificial Intelligence, 2024, 5(11): 1-10.
  • [6] State Augmentation via Self-Supervision in Offline Multiagent Reinforcement Learning. Wang, Siying; Li, Xiaodie; Qu, Hong; Chen, Wenyu. IEEE Transactions on Cognitive and Developmental Systems, 2024, 16(3): 1051-1062.
  • [7] Accelerating Self-Supervised Learning via Efficient Training Strategies. Kocyigit, Mustafa Taha; Hospedales, Timothy M.; Bilen, Hakan. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023: 5643-5653.
  • [8] Efficient Medical Image Assessment via Self-supervised Learning. Huang, Chun-Yin; Lei, Qi; Li, Xiaoxiao. Data Augmentation, Labelling, and Imperfections (DALI 2022), 2022, 13567: 102-111.
  • [9] Efficient Online Reinforcement Learning with Offline Data. Ball, Philip J.; Smith, Laura; Kostrikov, Ilya; Levine, Sergey. International Conference on Machine Learning, Vol. 202, 2023.
  • [10] Self-Supervised Graph Representation Learning via Information Bottleneck. Gu, Junhua; Zheng, Zichen; Zhou, Wenmiao; Zhang, Yajuan; Lu, Zhengjun; Yang, Liang. Symmetry-Basel, 2022, 14(4).