Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training

Cited by: 44
Authors
Yang, Dingqing [1]
Ghasemazar, Amin [1]
Ren, Xiaowei [1]
Golub, Maximilian [1,2]
Lemieux, Guy [1]
Lis, Mieszko [1]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
[2] Microsoft Corp, Redmond, WA 98052 USA
Source
2020 53RD ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO 2020), 2020
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
10.1109/MICRO50266.2020.00064
Chinese Library Classification (CLC): TP3 [Computing Technology, Computer Technology]
Discipline Code: 0812
Abstract
The success of DNN pruning has led to the development of energy-efficient inference accelerators that support pruned models with sparse weight and activation tensors. Because the memory layouts and dataflows in these architectures are optimized for the access patterns during inference, however, they do not efficiently support the emerging sparse training techniques. In this paper, we demonstrate (a) that accelerating sparse training requires a co-design approach where algorithms are adapted to suit the constraints of hardware, and (b) that hardware for sparse DNN training must tackle constraints that do not arise in inference accelerators. As proof of concept, we adapt a sparse training algorithm to be amenable to hardware acceleration; we then develop dataflow, data layout, and load-balancing techniques to accelerate it. The resulting system is a sparse DNN training accelerator that produces pruned models with the same accuracy as dense models, without first training a dense model, then pruning it, and finally retraining it. Compared to training the equivalent unpruned models using a state-of-the-art DNN accelerator without sparse training support, Procrustes consumes up to 3.26x less energy and offers up to 4x speedup across a range of models, while pruning weights by an order of magnitude and maintaining unpruned accuracy.
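As a rough illustration of the train-while-pruning idea the abstract describes (not the paper's actual algorithm, dataflow, or hardware mechanisms), the minimal sketch below trains a toy linear model while keeping only the top-k weights by magnitude after every update, so the model is sparse throughout training rather than trained dense, pruned, and retrained. The toy task, the variable names, and the top-k-by-magnitude criterion are illustrative assumptions, not details from the paper.

    # Minimal sketch: sparse training on a toy least-squares problem.
    # After every SGD step, all but the `keep` largest-magnitude weights
    # are zeroed, so no dense train-prune-retrain cycle is needed.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 32))                 # toy inputs
    true_w = np.zeros(32)
    true_w[:4] = [3.0, -2.0, 1.5, 0.5]             # sparse ground truth
    y = X @ true_w + 0.01 * rng.normal(size=256)   # noisy targets

    w = rng.normal(scale=0.1, size=32)             # dense initialization
    keep = 8                                       # weights kept (75% sparsity here)
    lr = 0.05

    for step in range(200):
        grad = X.T @ (X @ w - y) / len(y)          # MSE gradient
        w -= lr * grad                             # SGD update
        # Prune: zero everything below the keep-th largest magnitude.
        threshold = np.sort(np.abs(w))[-keep]
        w[np.abs(w) < threshold] = 0.0

    print("nonzero weights:", np.count_nonzero(w),
          "final MSE:", float(np.mean((X @ w - y) ** 2)))

The paper's contribution is not this pruning rule itself but making such training-time sparsity profitable in hardware, which requires the dataflow, data layout, and load-balancing support that a software sketch like this does not capture.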
Pages: 711-724 (14 pages)
Related Papers (50 total)
• [31] Training Deep Belief Network with Sparse Hidden Units. Hu, Zhen; Hu, Wenzheng; Zhang, Changshui. PATTERN RECOGNITION (CCPR 2014), PT I, 2014, 483: 11-20
• [32] An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices. Choi, Seungkyu; Sim, Jaehyeong; Kang, Myeonggu; Choi, Yeongjae; Kim, Hyeonuk; Kim, Lee-Sup. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2020, 55 (10): 2691-2702
• [33] A Two-way SRAM Array based Accelerator for Deep Neural Network On-chip Training. Jiang, Hongwu; Huang, Shanshi; Peng, Xiaochen; Su, Jian-Wei; Chou, Yen-Chi; Huang, Wei-Hsing; Liu, Ta-Wei; Liu, Ruhui; Chang, Meng-Fan; Yu, Shimeng. PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020
• [34] Deep Neural Network Acceleration With Sparse Prediction Layers. Yao, Zhongtian; Huang, Kejie; Shen, Haibin; Ming, Zhaoyan. IEEE ACCESS, 2020, 8: 6839-6848
• [35] Sparse Deep Neural Network Optimization for Embedded Intelligence. Bi, Jia; Gunn, Steve R. INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2020, 29 (3-4)
• [36] Sparse Deep Transfer Learning for Convolutional Neural Network. Liu, Jiaming; Wang, Yali; Qiao, Yu. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017: 2245-2251
• [37] SNAP: An Efficient Sparse Neural Acceleration Processor for Unstructured Sparse Deep Neural Network Inference. Zhang, Jie-Fang; Lee, Ching-En; Liu, Chester; Shao, Yakun Sophia; Keckler, Stephen W.; Zhang, Zhengya. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2021, 56 (02): 636-647
• [38] Neural Network Panning: Screening the Optimal Sparse Network Before Training. Kang, Xiatao; Li, Ping; Yao, Jiayi; Li, Chengxi. COMPUTER VISION - ACCV 2022, PT I, 2023, 13841: 602-617
• [39] Partitioning Sparse Deep Neural Networks for Scalable Training and Inference. Demirci, Gunduz Vehbi; Ferhatosmanoglu, Hakan. PROCEEDINGS OF THE 2021 ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ICS 2021, 2021: 254-265
• [40] Sparse Periodic Systolic Dataflow for Lowering Latency and Power Dissipation of Convolutional Neural Network Accelerators. Heo, Jung Hwan; Fayyazi, Arash; Esmaili, Amirhossein; Pedram, Massoud. 2022 ACM/IEEE INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN, ISLPED 2022, 2022