A Heterogeneous Acceleration System for Attention-Based Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Wiggins, Samuel [1 ]
Meng, Yuan [1 ]
Iyer, Mahesh A. [2 ]
Prasanna, Viktor [1 ]
Institutions
[1] Univ Southern Calif, Ming Hsieh Dept Elect & Comp Engn, Los Angeles, CA 90007 USA
[2] Intel Corp, Santa Clara, CA USA
Funding
National Science Foundation (USA);
Keywords
Multi-Agent Reinforcement Learning; Hardware Accelerator; Heterogeneous Computing;
DOI
10.1109/FPL64840.2024.00040
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-Agent Reinforcement Learning (MARL) is an emerging technology that has seen success in many AI applications. Multi-Actor-Attention-Critic (MAAC) is a state-of-the-art MARL algorithm that uses a Multi-Head Attention (MHA) mechanism to learn messages communicated among agents during the training process. Current implementations of MAAC using CPU and CPU-GPU platforms lack fine-grained parallelism among agents, sequentially executing each stage of the training loop, and their performance suffers from costly data movement involved in MHA communication learning. In this work, we develop the first high-throughput accelerator for MARL with attention-based communication on a CPU-FPGA heterogeneous system. We alleviate the limitations of existing implementations through a combination of data- and pipeline-parallel modules in our accelerator design and enable fine-grained system scheduling for exploiting concurrency among heterogeneous resources. Our design increases the overall system throughput by 4.6x and 4.1x compared to CPU and CPU-GPU implementations, respectively.
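The abstract notes that MAAC learns inter-agent communication through a Multi-Head Attention (MHA) mechanism, where each agent's critic attends over the other agents' encoded observations and actions. As a rough illustration of the computation being accelerated (not the paper's implementation; all shapes and names here are assumptions), a minimal NumPy sketch of multi-head scaled dot-product attention across agents might look like:

```python
import numpy as np

def multi_head_attention(queries, keys, values, num_heads):
    """Scaled dot-product attention over agents, split across heads.

    queries/keys/values: (n_agents, d_model) arrays; each agent
    attends over all agents' messages. Illustrative sketch only.
    """
    n, d_model = queries.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads

    def split(x):
        # (n, d_model) -> (num_heads, n, d_head)
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(queries), split(keys), split(values)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)      # (h, n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax per row
    out = weights @ v                                        # (h, n, d_head)
    # Concatenate heads back to (n_agents, d_model)
    return out.transpose(1, 0, 2).reshape(n, d_model)

# Hypothetical example: 4 agents, 8-dim embeddings, 2 heads
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
msg = multi_head_attention(x, x, x, num_heads=2)
print(msg.shape)  # (4, 8)
```

The per-head matrix products and the row-wise softmax are the data-parallel and pipeline-parallel hotspots the accelerator design targets, since on CPU/CPU-GPU systems each stage runs sequentially with costly data movement between agents.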
Pages: 236 - 242
Page count: 7
Related Papers
50 records in total
  • [31] Optimal robust formation control for heterogeneous multi-agent systems based on reinforcement learning
    Yan, Bing
    Shi, Peng
    Lim, Cheng-Chew
    Shi, Zhiyuan
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, 32 (05) : 2683 - 2704
  • [32] The Cooperative Reinforcement Learning in a Multi-Agent Design System
    Liu, Hong
    Wang, Jihua
    PROCEEDINGS OF THE 2013 IEEE 17TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN (CSCWD), 2013, : 139 - 144
  • [33] Cooperative Multi-Agent Reinforcement Learning in Express System
    Li, Yexin
    Zheng, Yu
    Yang, Qiang
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 805 - 814
  • [34] Multi-Agent Reinforcement Learning-Based Joint Caching and Routing in Heterogeneous Networks
    Yang, Meiyi
    Gao, Deyun
    Foh, Chuan Heng
    Quan, Wei
    Leung, Victor C. M.
    IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, 2024, 10 (05) : 1959 - 1974
  • [35] Heterogeneous Multi-Robot Cooperation With Asynchronous Multi-Agent Reinforcement Learning
    Zhang, Han
    Zhang, Xiaohui
    Feng, Zhao
    Xiao, Xiaohui
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (01) : 159 - 166
  • [36] Transfer Learning Method Using Ontology for Heterogeneous Multi-agent Reinforcement Learning
    Kono, Hitoshi
    Kamimura, Akiya
    Tomita, Kohji
    Murata, Yuta
    Suzuki, Tsuyoshi
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2014, 5 (10) : 156 - 164
  • [37] GAMA: Graph Attention Multi-agent reinforcement learning algorithm for cooperation
    Chen, Haoqiang
    Liu, Yadong
    Zhou, Zongtan
    Hu, Dewen
    Zhang, Ming
    APPLIED INTELLIGENCE, 2020, 50 (12) : 4195 - 4205
  • [38] Attention-Cooperated Reinforcement Learning for Multi-agent Path Planning
    Ma, Jinchao
    Lian, Defu
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS. DASFAA 2022 INTERNATIONAL WORKSHOPS, 2022, 13248 : 272 - 290
  • [39] AHAC: Actor Hierarchical Attention Critic for Multi-Agent Reinforcement Learning
    Wang, Yajie
    Shi, Dianxi
    Xue, Chao
    Jiang, Hao
    Wang, Gongju
    Gong, Peng
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 3013 - 3020