Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training

Cited by: 44
Authors
Yang, Dingqing [1]
Ghasemazar, Amin [1]
Ren, Xiaowei [1]
Golub, Maximilian [1,2]
Lemieux, Guy [1]
Lis, Mieszko [1]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
[2] Microsoft Corp, Redmond, WA 98052 USA
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
10.1109/MICRO50266.2020.00064
Chinese Library Classification
TP3 [Computing technology; computer technology]
Discipline Code
0812
Abstract
The success of DNN pruning has led to the development of energy-efficient inference accelerators that support pruned models with sparse weight and activation tensors. However, because the memory layouts and dataflows in these architectures are optimized for inference access patterns, they do not efficiently support emerging sparse training techniques. In this paper, we demonstrate (a) that accelerating sparse training requires a co-design approach where algorithms are adapted to suit the constraints of hardware, and (b) that hardware for sparse DNN training must tackle constraints that do not arise in inference accelerators. As proof of concept, we adapt a sparse training algorithm to be amenable to hardware acceleration; we then develop dataflow, data layout, and load-balancing techniques to accelerate it. The resulting system is a sparse DNN training accelerator that produces pruned models with the same accuracy as dense models, without the conventional pipeline of first training a dense model, then pruning it, and finally retraining it. Compared to training the equivalent unpruned models using a state-of-the-art DNN accelerator without sparse training support, Procrustes consumes up to 3.26× less energy and offers up to 4× speedup across a range of models, while pruning weights by an order of magnitude and maintaining unpruned accuracy.
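To make the train-while-pruning idea in the abstract concrete, below is a minimal PyTorch sketch of magnitude-based sparse training. It is an illustration under stated assumptions, not the paper's algorithm or hardware dataflow: the function names, the 0.9 sparsity target, and the per-step re-masking schedule are all hypothetical.

import torch
import torch.nn as nn

def apply_magnitude_mask(model: nn.Module, sparsity: float = 0.9) -> None:
    # Zero the smallest-magnitude entries of each weight tensor so that
    # roughly `sparsity` of its entries are zero (an order-of-magnitude
    # weight reduction at sparsity = 0.9).
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() < 2:            # skip biases and norm parameters
                continue
            k = int(p.numel() * sparsity)
            if k == 0:
                continue
            threshold = p.abs().flatten().kthvalue(k).values
            p.mul_((p.abs() > threshold).to(p.dtype))

def sparse_train_step(model, loss_fn, optimizer, x, y, sparsity=0.9):
    # One optimizer step followed by re-masking: the model is kept sparse
    # throughout training, rather than being trained dense, pruned, and
    # then retrained.
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    apply_magnitude_mask(model, sparsity)
    return loss.item()

In software this masking is simply a dense update followed by zeroing; the paper's point is that an accelerator can instead exploit the resulting sparsity directly in its dataflow, data layout, and load balancing.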
Pages: 711-724
Page count: 14
Related Papers
50 records in total (first 10 shown)
  • [1] Oh, Myungwoo; Lee, Chaeeun; Lee, Sanghun; Seo, Youngho; Kim, Sunwoo; Wang, Jooho; Park, Chester Sungchung. Convolutional Neural Network Accelerator with Reconfigurable Dataflow. 2018 International SoC Design Conference (ISOCC), 2018: 42-43.
  • [2] Xiao, Hao; Zhao, Kaikai; Liu, Guangzhu. Efficient Hardware Accelerator for Compressed Sparse Deep Neural Network. IEICE Transactions on Information and Systems, 2021, E104D(5): 772-775.
  • [3] Venkataramanaiah, Shreyas K.; Yin, Shihui; Cao, Yu; Seo, Jae-Sun. Deep Neural Network Training Accelerator Designs in ASIC and FPGA. 2020 17th International SoC Design Conference (ISOCC 2020), 2020: 21-22.
  • [4] Li, Yihuang; Ma, Sheng; Guo, Yang; Chen, Guilin; Xu, Rui. Single-Channel Dataflow for Convolutional Neural Network Accelerator. Proceedings of the 2018 IEEE 4th Information Technology and Mechatronics Engineering Conference (ITOEC 2018), 2018: 966-970.
  • [5] Hojabr, Reza; Givaki, Kamyar; Pourahmadi, Kossar; Nooralinejad, Parsa; Khonsari, Ahmad; Rahmati, Dara; Najafi, M. Hassan. TaxoNN: A Light-Weight Accelerator for Deep Neural Network Training. 2020 IEEE International Symposium on Circuits and Systems (ISCAS), 2020.
  • [6] Wang, Bo; Ma, Sheng; Luo, Shengbai; Wu, Lizhou; Zhang, Jianmin; Zhang, Chunyuan; Li, Tiejun. SparGD: A Sparse GEMM Accelerator with Dynamic Dataflow. ACM Transactions on Design Automation of Electronic Systems, 2024, 29(2).
  • [7] Wang, Bo; Ma, Sheng; Liu, Zhong; Huang, Libo; Yuan, Yuan; Dai, Yi. SADD: A Novel Systolic Array Accelerator with Dynamic Dataflow for Sparse GEMM in Deep Learning. Network and Parallel Computing (NPC 2022), 2022, 13615: 42-53.
  • [8] Zhao, Yulin; Wang, Donghui; Wang, Leiou. An Efficient Accelerator Unit for Sparse Convolutional Neural Network. Tenth International Conference on Digital Image Processing (ICDIP 2018), 2018, 10806.
  • [9] Schneider, Felix; Karagounis, Michael; Choubey, Bhaskar. Energy and Bandwidth Efficient Sparse Programmable Dataflow Accelerator. IEEE Transactions on Circuits and Systems I: Regular Papers, 2024, 71(9): 4092-4105.
  • [10] Thang Viet Huynh. Deep Neural Network Accelerator based on FPGA. 2017 4th NAFOSTED Conference on Information and Computer Science (NICS), 2017: 254-257.