Search-free Accelerator for Sparse Convolutional Neural Networks

Cited by: 0

Authors:

Liu, Bosheng [1,2]

Chen, Xiaoming [1,2]

Han, Yinhe [1]

Wang, Ying [1]

Li, Jiajun [1,2]

Xu, Haobo [1,2]

Li, Xiaowei [1]

Affiliations:

[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

Source:

2020 25TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2020 | 2020

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

Sparse convolution neural networks; sparsity-aware CNN accelerator; internal interconnect; memory bandwidth;

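DOI:

Not available

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract: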

Sparsification is an efficient way to reduce the on-chip memory demand of deep convolutional neural networks (CNNs). Most state-of-the-art CNN accelerators deliver high throughput for sparse CNNs by searching for pairs of nonzero weights and activations and then sending them to processing elements (PEs) for multiply-accumulate (MAC) operations. However, it is difficult to scale up their PE arrays for higher and more efficient computing because of the significant internal interconnect and memory bandwidth consumption. To address this dilemma, we propose a sparsity-aware architecture, called Swan, which eliminates the search process for sparse CNNs under limited interconnect and bandwidth resources. The architecture comprises two parts: a MAC unit that removes the search operation from the sparsity-aware MAC calculation, and a systolic compressive dataflow that suits the MAC architecture and heavily reuses inputs to save interconnect and bandwidth. With the proposed architecture, only one column of PEs needs to load/store data while all PEs can operate at full scale. Evaluation results based on a place-and-route process show that the proposed design, in a compact form factor with 4096 PEs, 4.9 TOP/s peak performance, and 2.97 W power at 600 MHz, achieves 1.5-2.1x speedup and 6.0-9.1x higher energy efficiency than state-of-the-art CNN accelerators with the same PE scale.
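
The peak-performance figure quoted in the abstract can be sanity-checked with a short calculation. The sketch below is not taken from the paper: it assumes that each PE completes one MAC per cycle and that one MAC counts as two operations (a multiply and an add). Under those assumptions, 4096 PEs at 600 MHz give a peak close to the stated 4.9 TOP/s. The function names `peak_tops` and `tops_per_watt` are illustrative only.

```python
# Assumption (not stated in the abstract): each PE performs one MAC per cycle,
# and one MAC is counted as 2 operations (multiply + add).
OPS_PER_MAC = 2

NUM_PES = 4096      # PE count, from the abstract
CLOCK_HZ = 600e6    # 600 MHz clock, from the abstract
POWER_W = 2.97      # power, from the abstract


def peak_tops(num_pes, clock_hz, ops_per_mac=OPS_PER_MAC):
    """Peak throughput in tera-operations per second (TOP/s)."""
    return num_pes * ops_per_mac * clock_hz / 1e12


def tops_per_watt(tops, power_w):
    """Peak throughput per watt (TOP/s/W)."""
    return tops / power_w


peak = peak_tops(NUM_PES, CLOCK_HZ)
print(f"peak throughput: {peak:.2f} TOP/s")
print(f"peak per watt:   {tops_per_watt(peak, POWER_W):.2f} TOP/s/W")
```

The per-watt value here is a peak ratio derived from the quoted numbers. It is not the measured energy efficiency behind the 6.0-9.1x comparison, which depends on the workloads and baselines evaluated in the paper.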

Pages: 524-529

Page count: 6

50 records in total (first 10 shown):

[1] Search-Free Inference Acceleration for Sparse Convolutional Neural Networks
Liu, Bosheng
Chen, Xiaoming
Han, Yinhe
Wu, Jigang
Chang, Liang
Liu, Peng
Xu, Haobo
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (07) : 2156 - 2169
[2] An Efficient Accelerator for Sparse Convolutional Neural Networks
You, Weijie
Wu, Chang
2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ASIC (ASICON), 2019,
[3] SparTen: A Sparse Tensor Accelerator for Convolutional Neural Networks
Gondimalla, Ashish
Chesnut, Noah
Thottethodi, Mithuna
Vijaykumar, T. N.
MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019, : 151 - 165
[4] SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks
Parashar, Angshuman
Rhu, Minsoo
Mukkara, Anurag
Puglielli, Antonio
Venkatesan, Rangharajan
Khailany, Brucek
Emer, Joel
Keckler, Stephen W.
Dally, William J.
44TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2017), 2017, : 27 - 40
[5] An Efficient and Flexible Accelerator Design for Sparse Convolutional Neural Networks
Xie, Xiaoru
Lin, Jun
Wang, Zhongfeng
Wei, Jinghe
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2021, 68 (07) : 2936 - 2949
[6] An Efficient Hardware Accelerator for Sparse Convolutional Neural Networks on FPGAs
Lu, Liqiang
Xie, Jiaming
Huang, Ruirui
Zhang, Jiansong
Lin, Wei
Liang, Yun
2019 27TH IEEE ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2019, : 17 - 25
[7] An Efficient Hardware Accelerator for Block Sparse Convolutional Neural Networks on FPGA
Yin, Xiaodi
Wu, Zhipeng
Li, Dejian
Shen, Chongfei
Liu, Yu
IEEE EMBEDDED SYSTEMS LETTERS, 2024, 16 (02) : 158 - 161
[8] An Efficient Hardware Accelerator for Structured Sparse Convolutional Neural Networks on FPGAs
Zhu, Chaoyang
Huang, Kejie
Yang, Shuyuan
Zhu, Ziqi
Zhang, Hejia
Shen, Haibin
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2020, 28 (09) : 1953 - 1965
[9] SpWA: An Efficient Sparse Winograd Convolutional Neural Networks Accelerator on FPGAs
Lu, Liqiang
Liang, Yun
2018 55TH ACM/ESDA/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2018,
[10] Enhancing Utilization of SIMD-Like Accelerator for Sparse Convolutional Neural Networks
Lai, Bo-Cheng
Pan, Jyun-Wei
Lin, Chien-Yu
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2019, 27 (05) : 1218 - 1222
