Search-free Accelerator for Sparse Convolutional Neural Networks

Cited by: 0
Authors
Liu, Bosheng [1, 2]
Chen, Xiaoming [1, 2]
Han, Yinhe [1]
Wang, Ying [1]
Li, Jiajun [1, 2]
Xu, Haobo [1, 2]
Li, Xiaowei [1]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Source
2020 25TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2020 | 2020
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Sparse convolution neural networks; sparsity-aware CNN accelerator; internal interconnect; memory bandwidth;
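DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract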
Sparsification is an efficient way to reduce the on-chip memory demand of deep convolutional neural networks (CNNs). Most state-of-the-art CNN accelerators deliver high throughput for sparse CNNs by searching for pairs of nonzero weights and activations and then sending them to processing elements (PEs) for multiply-accumulate (MAC) operations. However, their PE arrays are difficult to scale up for higher performance and efficiency because of the significant internal interconnect and memory bandwidth consumption. To resolve this dilemma, we propose a sparsity-aware architecture, called Swan, which eliminates the search process for sparse CNNs under limited interconnect and bandwidth resources. The architecture comprises two parts: a MAC unit that performs sparsity-aware MAC calculation without the search operation, and a systolic compressive dataflow that suits the MAC architecture and heavily reuses inputs to save interconnect and bandwidth. With the proposed architecture, only one column of PEs needs to load and store data while all PEs can operate at full scale. Evaluation results based on a place-and-route process show that the proposed design, in a compact form factor with 4096 PEs, 4.9 TOP/s peak performance, and 2.97 W power at 600 MHz, achieves 1.5-2.1x speedup and 6.0-9.1x higher energy efficiency than state-of-the-art CNN accelerators with the same PE scale.
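The peak-performance figure in the abstract can be cross-checked with simple arithmetic. The sketch below is only a sanity check, not the authors' evaluation method. It assumes that every PE completes one MAC per cycle and that one MAC counts as two operations; neither convention is stated in the abstract. Dividing by the quoted power gives an illustrative upper bound on efficiency, since the abstract does not say that 2.97 W was measured at peak throughput.

```python
# Figures quoted in the abstract.
NUM_PES = 4096
CLOCK_HZ = 600e6
POWER_W = 2.97

# Assumption (not stated in the abstract): one MAC per PE per cycle,
# and one MAC counted as two operations (a multiply and an add).
OPS_PER_MAC = 2


def peak_tops(num_pes, clock_hz, ops_per_mac=OPS_PER_MAC):
    """Peak throughput in tera-operations per second."""
    return num_pes * clock_hz * ops_per_mac / 1e12


def tops_per_watt(tops, power_w):
    """Throughput per watt, in TOP/s/W."""
    return tops / power_w


peak = peak_tops(NUM_PES, CLOCK_HZ)
efficiency = tops_per_watt(peak, POWER_W)

print(f"peak throughput: {peak:.4f} TOP/s")
print(f"illustrative peak efficiency: {efficiency:.2f} TOP/s/W")
```

Under these assumptions the product is 4.9152 TOP/s, which agrees with the 4.9 TOP/s quoted in the abstract.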
Pages: 524-529
Page count: 6
Related papers
50 in total
  • [1] Search-Free Inference Acceleration for Sparse Convolutional Neural Networks
    Liu, Bosheng
    Chen, Xiaoming
    Han, Yinhe
    Wu, Jigang
    Chang, Liang
    Liu, Peng
    Xu, Haobo
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (07) : 2156 - 2169
  • [2] An Efficient Accelerator for Sparse Convolutional Neural Networks
    You, Weijie
    Wu, Chang
    2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), 2019,
  • [3] SparTen: A Sparse Tensor Accelerator for Convolutional Neural Networks
    Gondimalla, Ashish
    Chesnut, Noah
    Thottethodi, Mithuna
    Vijaykumar, T. N.
    MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019, : 151 - 165
  • [4] SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks
    Parashar, Angshuman
    Rhu, Minsoo
    Mukkara, Anurag
    Puglielli, Antonio
    Venkatesan, Rangharajan
    Khailany, Brucek
    Emer, Joel
    Keckler, Stephen W.
    Dally, William J.
    44TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2017), 2017, : 27 - 40
  • [5] An Efficient and Flexible Accelerator Design for Sparse Convolutional Neural Networks
    Xie, Xiaoru
    Lin, Jun
    Wang, Zhongfeng
    Wei, Jinghe
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2021, 68 (07) : 2936 - 2949
  • [6] An Efficient Hardware Accelerator for Sparse Convolutional Neural Networks on FPGAs
    Lu, Liqiang
    Xie, Jiaming
    Huang, Ruirui
    Zhang, Jiansong
    Lin, Wei
    Liang, Yun
    2019 27TH IEEE ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2019, : 17 - 25
  • [7] An Efficient Hardware Accelerator for Block Sparse Convolutional Neural Networks on FPGA
    Yin, Xiaodi
    Wu, Zhipeng
    Li, Dejian
    Shen, Chongfei
    Liu, Yu
    IEEE EMBEDDED SYSTEMS LETTERS, 2024, 16 (02) : 158 - 161
  • [8] An Efficient Hardware Accelerator for Structured Sparse Convolutional Neural Networks on FPGAs
    Zhu, Chaoyang
    Huang, Kejie
    Yang, Shuyuan
    Zhu, Ziqi
    Zhang, Hejia
    Shen, Haibin
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2020, 28 (09) : 1953 - 1965
  • [9] SpWA: An Efficient Sparse Winograd Convolutional Neural Networks Accelerator on FPGAs
    Lu, Liqiang
    Liang, Yun
    2018 55TH ACM/ESDA/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2018,
  • [10] Enhancing Utilization of SIMD-Like Accelerator for Sparse Convolutional Neural Networks
    Lai, Bo-Cheng
    Pan, Jyun-Wei
    Lin, Chien-Yu
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2019, 27 (05) : 1218 - 1222