Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training

Cited by: 44
|
Authors
Yang, Dingqing [1 ]
Ghasemazar, Amin [1 ]
Ren, Xiaowei [1 ]
Golub, Maximilian [1 ,2 ]
Lemieux, Guy [1 ]
Lis, Mieszko [1 ]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
[2] Microsoft Corp, Redmond, WA 98052 USA
Source
2020 53RD ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO 2020) | 2020
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
DOI
10.1109/MICRO50266.2020.00064
Chinese Library Classification (CLC)
TP3 [Computing technology and computer technology];
Discipline classification code
0812;
Abstract
The success of DNN pruning has led to the development of energy-efficient inference accelerators that support pruned models with sparse weight and activation tensors. However, because the memory layouts and dataflows in these architectures are optimized for the access patterns of inference, they do not efficiently support emerging sparse training techniques. In this paper, we demonstrate (a) that accelerating sparse training requires a co-design approach in which algorithms are adapted to suit the constraints of hardware, and (b) that hardware for sparse DNN training must tackle constraints that do not arise in inference accelerators. As a proof of concept, we adapt a sparse training algorithm to be amenable to hardware acceleration; we then develop dataflow, data layout, and load-balancing techniques to accelerate it. The resulting system is a sparse DNN training accelerator that produces pruned models with the same accuracy as dense models without first training, then pruning, and finally retraining a dense model. Compared to training the equivalent unpruned models using a state-of-the-art DNN accelerator without sparse training support, Procrustes consumes up to 3.26x less energy and offers up to 4x speedup across a range of models, while pruning weights by an order of magnitude and maintaining unpruned accuracy.
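For intuition on what "pruning during training" means in the abstract above, here is a minimal, generic sketch of magnitude-based pruning applied inside a training loop, which avoids the separate train-prune-retrain pipeline. This is NOT the paper's dataflow or algorithm: the toy least-squares task, layer size, and pruning schedule are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: y = X @ w_true with a sparse ground-truth weight vector.
n_features = 64
X = rng.normal(size=(256, n_features))
w_true = np.zeros(n_features)
w_true[rng.choice(n_features, size=6, replace=False)] = rng.normal(size=6)
y = X @ w_true

w = rng.normal(scale=0.1, size=n_features)
mask = np.ones(n_features, dtype=bool)   # active (unpruned) weights
lr = 0.01

for step in range(300):
    grad = X.T @ (X @ w - y) / len(y)    # gradient of 0.5 * mean squared error
    w -= lr * grad
    w[~mask] = 0.0                       # pruned weights stay zero
    # Periodically prune the smallest-magnitude active weights *while* training,
    # instead of training a dense model first and pruning afterwards.
    if step % 50 == 49:
        keep = max(6, int(mask.sum() * 0.5))          # halve the active set
        threshold = np.sort(np.abs(w))[-keep]
        mask = np.abs(w) >= threshold
        w[~mask] = 0.0

sparsity = 1.0 - mask.mean()
print(f"final sparsity: {sparsity:.2f}")
```

The schedule prunes roughly an order of magnitude of the weights (64 down to 6 here), mirroring the sparsity level the abstract reports; a hardware accelerator like Procrustes exploits the resulting sparse tensors in the forward and backward passes.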
Pages: 711-724
Number of pages: 14
Related Papers
50 records in total
  • [21] An FPGA Accelerator for Spiking Neural Network Simulation and Training
    Sakellariou, Vasilis
    Paliouras, Vassilis
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,
  • [22] A Visual Tracking Deep Convolutional Neural Network Accelerator
    Qin, Zhiyong
    Yu, Lixin
    PROCEEDINGS OF THE 2017 2ND INTERNATIONAL CONFERENCE ON AUTOMATION, MECHANICAL CONTROL AND COMPUTATIONAL ENGINEERING (AMCCE 2017), 2017, 118 : 493 - 499
  • [23] Improving Hardware Efficiency of a Sparse Training Accelerator by Restructuring a Reduction Network
    Shin, Banseok
    Park, Sehun
    Kung, Jaeha
    2023 21ST IEEE INTERREGIONAL NEWCAS CONFERENCE, NEWCAS, 2023,
  • [24] Performance of Training Sparse Deep Neural Networks on GPUs
    Wang, Jianzong
    Huang, Zhangcheng
    Kong, Lingwei
    Xiao, Jing
    Wang, Pengyu
    Zhang, Lu
    Li, Chao
    2019 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2019,
  • [25] Acceleration of Sparse Convolutional Neural Network Based on Coarse-Grained Dataflow Architecture
    Wu X.
    Ou Y.
    Li W.
    Wang D.
    Zhang H.
    Fan D.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2021, 58 (07): 1504 - 1517
  • [26] Dithered backprop: A sparse and quantized backpropagation algorithm for more efficient deep neural network training
    Wiedemann, Simon
    Mehari, Temesgen
    Kepp, Kevin
    Samek, Wojciech
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 3096 - 3104
  • [27] ESSA: Design of a Programmable Efficient Sparse Spiking Neural Network Accelerator
    Kuang, Yisong
    Cui, Xiaoxin
    Wang, Zilin
    Zou, Chenglong
    Zhong, Yi
    Liu, Kefei
    Dai, Zhenhui
    Yu, Dunshan
    Wang, Yuan
    Huang, Ru
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2022, 30 (11) : 1631 - 1641
  • [28] FPGA Accelerator for Homomorphic Encrypted Sparse Convolutional Neural Network Inference
    Yang, Yang
    Kuppannagari, Sanmukh R.
    Kannan, Rajgopal
    Prasanna, Viktor K.
    2022 IEEE 30TH INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2022), 2022, : 81 - 89
  • [29] Visualization in Deep Neural Network Training
    Kollias, Stefanos
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2022, 31 (03)
  • [30] Activation in Network for NoC-based Deep Neural Network Accelerator
    Zhu, Wenyao
    Chen, Yizhi
    Lu, Zhonghai
    2024 INTERNATIONAL VLSI SYMPOSIUM ON TECHNOLOGY, SYSTEMS AND APPLICATIONS, VLSI TSA, 2024,