TNSS: Two-Nibble Sparsity-Aware Stride Decomposing Acceleration for Convolutional Neural Networks

Cited by: 0
Authors
Huang, Yun-Yin [1 ]
Chen, Yu-Guang [1 ]
Jou, Jing-Yang [1 ]
Affiliations
[1] Natl Cent Univ, Dept Elect Engn, Taoyuan, Taiwan
Keywords
convolutional neural networks; data-level sparsity; bit-level sparsity; data compression; stride-decompose
DOI
10.1109/APCCAS62602.2024.10808528
Abstract
Convolutional Neural Networks (CNNs) are effective in image processing but waste computational resources on the sparsity in feature maps and weights, leading to inefficient computations. Compressing sparse data introduces irregularity and makes feature-map-weight matching for MAC operations difficult when the convolution stride is not one. To address these problems, this paper proposes a Two-Nibble Sparsity-Aware Stride-Decomposing (TNSS) scheme that efficiently eliminates zero-value computations in non-unit-stride scenarios while also exploiting bit-level sparsity to further improve inference efficiency. In TNSS, tensors are first decomposed into several unit-stride tensors; TNSS then compresses the feature map using a two-nibble representation, reducing both data size and computation load. Experimental results show that TNSS achieves average speedups of 8.2x over a conventional architecture and 1.36x over the recent accelerator StarSPA on VGG16 and MobileNetV1.
Pages: 795-799
Page count: 5