BitPruner: Network Pruning for Bit-serial Accelerators

Cited by: 21
Authors
Zhao, Xiandong [1, 2]
Wang, Ying [1, 2, 3]
Liu, Cheng [1 ]
Shi, Cong [4 ]
Tu, Kaijie [1 ]
Zhang, Lei [1 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] State Key Lab Comp Architecture, Beijing, Peoples R China
[4] Chongqing Univ, Chongqing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
neural network; bit-serial accelerator; bit-pruning
DOI
10.1109/dac18072.2020.9218534
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Bit-serial architectures (BSAs) are becoming increasingly popular in low-power neural network processor (NNP) design. However, the performance and efficiency of state-of-the-art BSA NNPs depend heavily on the distribution of ineffectual weight-bits in the running neural network. To boost the efficiency of third-party BSA accelerators, this work presents Bit-Pruner, a software approach that learns BSA-favored neural networks without resorting to hardware modifications. The proposed techniques not only progressively prune but also structure the non-zero bits in weights, so that the number of zero-bits in the model is increased and the remaining bits are load-balanced to suit the architecture of the target BSA accelerators. In experiments on a set of representative neural networks, Bit-Pruner increases bit-sparsity up to 94.4% with negligible accuracy degradation. When the bit-pruned models are deployed on typical BSA accelerators, the average performance is 2.1X and 1.5X higher than the baselines running non-pruned and weight-pruned networks, respectively.
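To make the two quantities in the abstract concrete, the following is a minimal sketch of how bit-sparsity and per-group bit load might be measured on quantized weights. It is not the paper's implementation. The function names `bit_sparsity` and `group_bit_load`, the 8-bit width, the use of weight magnitudes, and the assumption that a group of lanes is limited by its weight with the most non-zero bits are all illustrative assumptions.

```python
def popcount(value, bits=8):
    """Count the non-zero bits in the magnitude of an integer weight."""
    mask = (1 << bits) - 1
    return bin(abs(value) & mask).count("1")


def bit_sparsity(weights, bits=8):
    """Fraction of zero bits over all weights (assumed metric, magnitudes only)."""
    if not weights:
        raise ValueError("weights must not be empty")
    total_bits = len(weights) * bits
    nonzero_bits = sum(popcount(w, bits) for w in weights)
    return 1.0 - nonzero_bits / total_bits


def group_bit_load(weights, group_size, bits=8):
    """Largest non-zero bit count in each group of weights.

    Assumption: lanes in a group advance together, so the weight with the
    most non-zero bits bounds the group's work. Balanced groups have
    similar values across the returned list.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    loads = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        loads.append(max(popcount(w, bits) for w in group))
    return loads


if __name__ == "__main__":
    example = [0, 1, 2, 255]  # hypothetical 8-bit weight magnitudes
    print(bit_sparsity(example))       # 22 of 32 bits are zero
    print(group_bit_load(example, 2))  # per-group maximum popcount
```

Under these assumptions, pruning raises the first number, while structuring the remaining bits narrows the spread of the second.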
Pages: 6