BitPruner: Network Pruning for Bit-serial Accelerators

Cited by: 21
Authors
Zhao, Xiandong [1 ,2 ]
Wang, Ying [1 ,2 ,3 ]
Liu, Cheng [1 ]
Shi, Cong [4 ]
Tu, Kaijie [1 ]
Zhang, Lei [1 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] State Key Lab Comp Architecture, Beijing, Peoples R China
[4] Chongqing Univ, Chongqing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
neural network; bit-serial accelerator; bit-pruning;
DOI
10.1109/dac18072.2020.9218534
CLC Number
TP31 [Computer Software]
Subject Codes
081202; 0835
Abstract
Bit-serial architectures (BSAs) are becoming increasingly popular in low-power neural network processor (NNP) design. However, the performance and efficiency of state-of-the-art BSA NNPs depend heavily on the distribution of ineffectual weight-bits in the running neural network. To boost the efficiency of third-party BSA accelerators, this work presents Bit-Pruner, a software approach to learning BSA-favored neural networks without resorting to hardware modifications. The proposed techniques not only progressively prune but also structure the non-zero bits in weights, so that the number of zero-bits in the model is increased and also load-balanced to suit the architecture of the target BSA accelerators. According to our experiments on a set of representative neural networks, Bit-Pruner increases bit-sparsity by up to 94.4% with negligible accuracy degradation. When the bit-pruned models are deployed onto typical BSA accelerators, the average performance is 2.1X and 1.5X higher than the baselines running non-pruned and weight-pruned networks, respectively.
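The core idea the abstract describes, pruning individual non-zero bits of weights so a bit-serial accelerator can skip more zero-bit cycles, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name, the 8-bit width, and the per-weight bit budget are assumptions for the example.

```python
def bit_prune(weight: int, max_nonzero_bits: int, width: int = 8) -> int:
    """Keep only the `max_nonzero_bits` most significant set bits of |weight|.

    Illustrative sketch of bit-level pruning: each dropped set bit is one
    fewer effectual cycle for a bit-serial multiplier, at the cost of a
    small quantization error in the weight value.
    """
    sign = -1 if weight < 0 else 1
    magnitude = abs(weight)
    pruned, kept = 0, 0
    for bit in range(width - 1, -1, -1):  # scan from MSB down to LSB
        if (magnitude >> bit) & 1:
            if kept < max_nonzero_bits:
                pruned |= 1 << bit        # keep this significant bit
                kept += 1
            # else: drop this less-significant set bit (introduces error)
    return sign * pruned
```

For example, pruning the weight 109 (binary 1101101, five set bits) to a budget of two bits keeps only the two most significant set bits, yielding 96 (binary 1100000). A training loop would typically re-apply such a projection while fine-tuning, so the network can compensate for the induced error; the load-balancing aspect mentioned in the abstract would additionally constrain budgets across weight groups.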
Pages: 6
Related Papers
50 records total
  • [1] Network Pruning for Bit-Serial Accelerators
    Zhao, Xiandong
    Wang, Ying
    Liu, Cheng
    Shi, Cong
    Tu, Kaijie
    Zhang, Lei
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (05) : 1597 - 1609
  • [2] ON A BIT-SERIAL INPUT AND BIT-SERIAL OUTPUT MULTIPLIER
    GNANASEKARAN, R
    IEEE TRANSACTIONS ON COMPUTERS, 1983, 32 (09) : 878 - 880
  • [3] Bit-Serial Cache: Exploiting Input Bit Vector Repetition to Accelerate Bit-Serial Inference
    Lo, Yun-Chen
    Liu, Ren-Shuo
    2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC, 2023,
  • [4] Stripes: Bit-Serial Deep Neural Network Computing
    Judd, Patrick
    Albericio, Jorge
    Hetherington, Tayler
    Aamodt, Tor M.
    Moshovos, Andreas
    2016 49TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2016,
  • [5] Stripes: Bit-Serial Deep Neural Network Computing
    Judd, Patrick
    Albericio, Jorge
    Moshovos, Andreas
    IEEE COMPUTER ARCHITECTURE LETTERS, 2017, 16 (01) : 80 - 83
  • [6] BitCluster: Fine-Grained Weight Quantization for Load-Balanced Bit-Serial Neural Network Accelerators
    Li, Ang
    Mo, Huiyu
    Zhu, Wenping
    Li, Qiang
    Yin, Shouyi
    Wei, Shaojun
    Liu, Leibo
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (11) : 4747 - 4757
  • [7] BIT-SERIAL MULTIPLIERS AND SQUARERS
    IENNE, P
    VIREDAZ, MA
    IEEE TRANSACTIONS ON COMPUTERS, 1994, 43 (12) : 1445 - 1450
  • [8] On Bit-Serial NoCs for FPGAs
    Kapre, Nachiket
    2017 IEEE 25TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2017), 2017, : 32 - 39
  • [9] BIT-SERIAL MODULAR MULTIPLIER
    TOMLINSON, A
    ELECTRONICS LETTERS, 1989, 25 (24) : 1664 - 1664
  • [10] A COMPARISON OF BIT-SERIAL AND BIT PARALLEL DCT DESIGNS
    CROOK, D
    FULCHER, J
    VLSI DESIGN, 1995, 3 (01) : 59 - 65