A High-Performance and Energy-Efficient Photonic Architecture for Multi-DNN Acceleration

Cited by: 0
Authors
Li, Yuan [1 ]
Louri, Ahmed [1 ]
Karanth, Avinash [2 ]
Affiliations
[1] George Washington Univ, Dept Elect & Comp Engn, Washington, DC 20052 USA
[2] Ohio Univ, Sch Elect Engn & Comp Sci, Athens, OH 45701 USA
Funding
National Science Foundation (US);
Keywords
Accelerator; dataflow; deep neural network; silicon photonics;
DOI
10.1109/TPDS.2023.3327535
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
Large-scale deep neural network (DNN) accelerators are poised to facilitate the concurrent processing of diverse DNNs, imposing demanding challenges on the interconnection fabric. These challenges encompass overcoming performance degradation and energy increase associated with system scaling while also necessitating flexibility to support dynamic partitioning and adaptable organization of compute resources. Nevertheless, conventional metallic-based interconnects frequently confront inherent limitations in scalability and flexibility. In this paper, we leverage silicon photonic interconnects and adopt an algorithm-architecture co-design approach to develop MDA, a DNN accelerator meticulously crafted to empower high-performance and energy-efficient concurrent processing of diverse DNNs. Specifically, MDA consists of three novel components: 1) a resource allocation algorithm that assigns compute resources to concurrent DNNs based on their computational demands and priorities; 2) a dataflow selection algorithm that determines off-chip and on-chip dataflows for each DNN, with the objectives of minimizing off-chip and on-chip memory accesses, respectively; 3) a flexible silicon photonic network that can be dynamically segmented into sub-networks, each interconnecting the assigned compute resources of a certain DNN while adapting to the communication patterns dictated by the selected on-chip dataflow. Simulation results show that the proposed MDA accelerator outperforms other state-of-the-art multi-DNN accelerators, including PREMA, AI-MT, Planaria, and HDA. MDA achieves a 3.6x speedup, accompanied by substantial improvements of 7.3x, 12.7x, and 9.2x in energy efficiency, service-level agreement (SLA) satisfaction rate, and fairness, respectively.
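As a rough illustration of the first component described in the abstract (assigning compute resources to concurrent DNNs based on computational demands and priorities), the Python sketch below implements a simple proportional-share rule. The `DNNRequest` fields, the `allocate_pes` function, and the demand-times-priority weighting are illustrative assumptions, not the allocation algorithm from the paper.

```python
from dataclasses import dataclass


@dataclass
class DNNRequest:
    """One concurrently running DNN competing for compute resources."""
    name: str
    flops: float      # estimated computational demand, e.g., MACs per inference
    priority: float   # larger value = higher scheduling priority


def allocate_pes(requests: list[DNNRequest], total_pes: int) -> dict[str, int]:
    """Split a fixed pool of processing elements (PEs) among concurrent DNNs.

    Illustrative rule only: each DNN receives a share proportional to
    (demand * priority); leftover PEs go to the largest fractional shares.
    """
    weights = {r.name: r.flops * r.priority for r in requests}
    total_weight = sum(weights.values())
    raw = {name: total_pes * w / total_weight for name, w in weights.items()}
    alloc = {name: int(share) for name, share in raw.items()}  # floor of each share
    # Hand the remaining 0..(n-1) PEs to the DNNs with the largest fractional parts.
    leftover = total_pes - sum(alloc.values())
    for name in sorted(raw, key=lambda n: raw[n] - alloc[n], reverse=True)[:leftover]:
        alloc[name] += 1
    return alloc


if __name__ == "__main__":
    reqs = [DNNRequest("resnet50", flops=4.1e9, priority=2.0),
            DNNRequest("mobilenet_v2", flops=0.6e9, priority=1.0),
            DNNRequest("bert_base", flops=22.0e9, priority=1.0)]
    print(allocate_pes(reqs, total_pes=128))
    # -> {'resnet50': 34, 'mobilenet_v2': 3, 'bert_base': 91}
```

The largest-remainder rounding step keeps the total allocation exactly equal to the PE count, mirroring the constraint of partitioning a fixed pool of compute resources among co-running DNNs; the paper's actual algorithm additionally coordinates with dataflow selection and photonic network segmentation.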
Pages: 46-58
Number of pages: 13
Related Papers
50 records in total
  • [41] Choudhury, D., Rajam, A. S., Kalyanaraman, A., Pande, P. P., "High-Performance and Energy-Efficient 3D Manycore GPU Architecture for Accelerating Graph Analytics," ACM Journal on Emerging Technologies in Computing Systems, 2022, 18(1).
  • [42] Wu, M.-C., Chen, J.-Y., Ting, Y.-H., Huang, C.-Y., Wu, W.-W., "A Novel High-Performance and Energy-Efficient RRAM Device with Multi-Functional Conducting Nanofilaments," Nano Energy, 2021, 82.
  • [43] Cheng, Y., Wang, C., Zhao, Y., Chen, X., Zhou, X., Li, X., "MuDBN: An Energy-Efficient and High-Performance Multi-FPGA Accelerator for Deep Belief Networks," Proceedings of the 2018 Great Lakes Symposium on VLSI (GLSVLSI '18), 2018, pp. 435-438.
  • [44] Kumar, J. C. R., Kumar, D. V., "Energy-Efficient, High-Performance and Memory Efficient FIR Adaptive Filter Architecture of Wireless Sensor Networks for IoT Applications," Sadhana - Academy Proceedings in Engineering Sciences, 2022, 47(4).
  • [45] Kumar, J. C. R., Kumar, D. V., "Energy-Efficient, High-Performance and Memory Efficient FIR Adaptive Filter Architecture of Wireless Sensor Networks for IoT Applications," Sādhanā, 2022, 47.
  • [46] Liu, B., Guo, S., Qin, H., Gong, Y., Yang, J., Ge, W., Yang, J., "An Energy-Efficient Reconfigurable Hybrid DNN Architecture for Speech Recognition with Approximate Computing," 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), 2018.
  • [47] Li, Y., Louri, A., Karanth, A., "SPRINT: A High-Performance, Energy-Efficient, and Scalable Chiplet-Based Accelerator With Photonic Interconnects for CNN Inference," IEEE Transactions on Parallel and Distributed Systems, 2022, 33(10), pp. 2332-2345.
  • [48] Zhao, Z., Ling, N., Guan, N., Xing, G., "Aaron: Compile-time Kernel Adaptation for Multi-DNN Inference Acceleration on Edge GPU," Proceedings of the Twentieth ACM Conference on Embedded Networked Sensor Systems (SenSys 2022), 2022, pp. 802-803.
  • [49] Singh, T., Schaefer, A., Rangarajan, S., John, D., Henrion, C., Schreiber, R., Rodriguez, M., Kosonocky, S., Naffziger, S., Novak, A., "Zen: An Energy-Efficient High-Performance x86 Core," IEEE Journal of Solid-State Circuits, 2018, 53(1), pp. 102-114.
  • [50] Sundararajan, K. T., Porpodas, V., Jones, T. M., Topham, N. P., Franke, B., "Cooperative Partitioning: Energy-Efficient Cache Partitioning for High-Performance CMPs," 2012 IEEE 18th International Symposium on High Performance Computer Architecture (HPCA), 2012, pp. 311-322.