Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training

Cited by: 44
Authors
Yang, Dingqing [1]
Ghasemazar, Amin [1]
Ren, Xiaowei [1]
Golub, Maximilian [1,2]
Lemieux, Guy [1]
Lis, Mieszko [1]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
[2] Microsoft Corp, Redmond, WA 98052 USA
Source
2020 53RD ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO 2020) | 2020
Funding
Natural Sciences and Engineering Research Council of Canada;
DOI
10.1109/MICRO50266.2020.00064
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
The success of DNN pruning has led to the development of energy-efficient inference accelerators that support pruned models with sparse weight and activation tensors. Because the memory layouts and dataflows in these architectures are optimized for the access patterns during inference, however, they do not efficiently support the emerging sparse training techniques. In this paper, we demonstrate (a) that accelerating sparse training requires a co-design approach where algorithms are adapted to suit the constraints of hardware, and (b) that hardware for sparse DNN training must tackle constraints that do not arise in inference accelerators. As proof of concept, we adapt a sparse training algorithm to be amenable to hardware acceleration; we then develop dataflow, data layout, and load-balancing techniques to accelerate it. The resulting system is a sparse DNN training accelerator that produces pruned models with the same accuracy as dense models, without the conventional pipeline of first training, then pruning, and finally retraining a dense model. Compared to training the equivalent unpruned models using a state-of-the-art DNN accelerator without sparse training support, Procrustes consumes up to 3.26x less energy and offers up to 4x speedup across a range of models, while pruning weights by an order of magnitude and maintaining unpruned accuracy.
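To make the abstract's notion of sparse training concrete, the sketch below shows a generic magnitude-based sparse training loop in PyTorch: weights are masked to a target density during training itself, so no dense train-then-prune-then-retrain pipeline is needed. This is an illustrative assumption, not the specific algorithm, dataflow, or load-balancing scheme co-designed in the paper; the helper name topk_mask and the density parameter are hypothetical.

```python
import torch
import torch.nn as nn

# Hedged sketch of magnitude-based sparse training (NOT the paper's exact
# algorithm): weights are re-pruned after every optimizer step, so the model
# is trained sparse from the start rather than densely trained, pruned, and
# retrained.

def topk_mask(w: torch.Tensor, density: float) -> torch.Tensor:
    """Boolean mask keeping the `density` fraction of largest-magnitude weights."""
    k = max(1, int(density * w.numel()))
    # k-th largest magnitude == (numel - k + 1)-th smallest
    threshold = w.abs().flatten().kthvalue(w.numel() - k + 1).values
    return w.abs() >= threshold

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
density = 0.1  # keep ~10% of weights, i.e. prune by an order of magnitude

for step in range(1000):
    x = torch.randn(64, 784)             # stand-in for real training data
    y = torch.randint(0, 10, (64,))
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Re-impose sparsity after the dense update. An accelerator like the one
    # described would instead keep tensors sparse end to end, which is where
    # its dataflow, layout, and load-balancing techniques come in.
    with torch.no_grad():
        for m in model:
            if isinstance(m, nn.Linear):
                m.weight.mul_(topk_mask(m.weight, density))
```

In software this masking only zeroes values; the hardware contribution described in the abstract is exploiting those zeros, with sparse layouts and balanced work distribution, to save the reported energy and time.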
Pages: 711-724
Number of pages: 14