Hardware Accelerator Design for Sparse DNN Inference and Training: A Tutorial

Cited by: 3
Authors
Mao, Wendong [1 ]
Wang, Meiqi [1 ]
Xie, Xiaoru [2 ]
Wu, Xiao [2 ]
Wang, Zhongfeng [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Integrated Circuits, Shenzhen Campus, Shenzhen 518107, Guangdong, Peoples R China
[2] Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210008, Peoples R China
Keywords
Hardware acceleration; sparsity; CNN; transformer; tutorial; deep learning; FLEXIBLE ACCELERATOR; NEURAL-NETWORKS; EFFICIENT; ARCHITECTURE;
DOI
10.1109/TCSII.2023.3344681
CLC classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline codes
0808; 0809;
Abstract
Deep neural networks (DNNs) are widely used in many fields, such as artificial-intelligence-generated content (AIGC) and robotics. To support these tasks efficiently, model pruning techniques have been developed to compress computation- and memory-intensive DNNs. However, directly executing these sparse models on a common hardware accelerator can cause significant under-utilization, since the invalid data produced by sparse patterns leads to unnecessary computations and irregular memory accesses. This brief analyzes the critical issues in accelerating sparse models and provides an overview of typical hardware designs for various sparse DNNs, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), generative adversarial networks (GANs), and Transformers. Following the overview, we give a practical guideline for designing efficient accelerators for sparse DNNs, with qualitative metrics to evaluate hardware overhead in different cases. In addition, we highlight potential opportunities for hardware/software/algorithm co-optimization from the perspective of sparse DNN implementation, and provide insights into recent design trends for efficient implementation of Transformers with sparse attention, which facilitates large language model (LLM) deployment with high throughput and energy efficiency.
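To make the under-utilization issue concrete, below is a minimal, self-contained Python sketch (illustrative only, not one of the accelerator designs surveyed in the tutorial) contrasting a dense matrix-vector product with a compressed-sparse-row (CSR) variant: skipping pruned zero weights removes redundant multiply-accumulates, but activation fetches become index-driven, i.e., the irregular memory accesses that a plain dense datapath cannot exploit.

    # Illustrative sketch (assumption: CSR storage, not the paper's method):
    # dense vs. zero-skipping matrix-vector product for a pruned weight matrix.

    def dense_matvec(W, x):
        # Dense loop: every weight is fetched and multiplied, including the
        # zeros introduced by pruning, so utilization drops with sparsity.
        return [sum(W[r][c] * x[c] for c in range(len(x))) for r in range(len(W))]

    def csr_compress(W):
        # Keep only nonzero values plus their column indices (CSR format).
        values, col_idx, row_ptr = [], [], [0]
        for row in W:
            for c, w in enumerate(row):
                if w != 0:
                    values.append(w)
                    col_idx.append(c)
            row_ptr.append(len(values))
        return values, col_idx, row_ptr

    def csr_matvec(values, col_idx, row_ptr, x):
        # Sparse loop: zero weights are skipped (fewer MACs), but x[col_idx[i]]
        # is an irregular, index-driven memory access.
        y = []
        for r in range(len(row_ptr) - 1):
            acc = 0
            for i in range(row_ptr[r], row_ptr[r + 1]):
                acc += values[i] * x[col_idx[i]]
            y.append(acc)
        return y

    if __name__ == "__main__":
        W = [[0, 2, 0, 0],   # a pruned 4x4 weight matrix (75% zeros)
             [0, 0, 0, 3],
             [1, 0, 0, 0],
             [0, 0, 4, 0]]
        x = [1.0, 2.0, 3.0, 4.0]
        assert dense_matvec(W, x) == csr_matvec(*csr_compress(W), x)

Both loops compute the same result; the sparse version performs only 4 multiply-accumulates instead of 16, which is exactly the gap a sparsity-aware accelerator tries to capture in hardware.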
Pages: 1708-1714
Number of pages: 7