Layer-Wise Data-Free CNN Compression

Cited by: 1
Authors
Horton, Maxwell [1 ]
Jin, Yanzi [1 ]
Farhadi, Ali [1 ]
Rastegari, Mohammad [1 ]
Affiliation
[1] Apple, Cupertino, CA 95014 USA
DOI
10.1109/ICPR56361.2022.9956237
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a computationally efficient method for compressing a trained neural network without using real data. We break the problem of data-free network compression into independent layer-wise compressions. We show how to efficiently generate layer-wise training data using only a pretrained network. We use this data to perform independent layer-wise compressions on the pretrained network. We also show how to precondition the network to improve the accuracy of our layer-wise compression method. We present results for layer-wise compression using quantization and pruning. When quantizing, we compress with higher accuracy than related works while using orders of magnitude less compute. When compressing MobileNetV2 and evaluating on ImageNet, our method outperforms existing methods for quantization at all bit-widths, achieving a +0.34% improvement in 8-bit quantization, and a stronger improvement at lower bit-widths (up to a +28.50% improvement at 5 bits). When pruning, we outperform baselines of a similar compute envelope, achieving 1.5 times the sparsity rate at the same accuracy. We also show how to combine our efficient method with high-compute generative methods to improve upon their results.
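The abstract describes compressing each layer independently so that it reproduces the pretrained layer's outputs on data generated from the pretrained network alone. The sketch below illustrates that general idea for one convolution; the fake_quant and compress_layer helpers, the synthetic Gaussian inputs, and the straight-through fake-quantizer are assumptions for illustration only, not the paper's actual data-generation, preconditioning, or quantization procedure.

```python
# Minimal sketch of layer-wise, data-free quantization calibration for one Conv2d.
# Assumptions (not from the paper): synthetic Gaussian inputs stand in for the
# paper's generated layer-wise data; a uniform symmetric fake-quantizer with a
# straight-through gradient stands in for its quantization scheme.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Uniform symmetric fake-quantization with a straight-through gradient."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.detach().abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
    return w + (q - w).detach()  # gradient flows to w as identity

def compress_layer(layer: nn.Conv2d, input_shape, num_bits=8, steps=200, lr=1e-3):
    """Fit quantized weights so this layer reproduces the pretrained layer's
    outputs on synthetic inputs; each layer is handled independently."""
    w = layer.weight.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x = torch.randn(8, *input_shape)            # synthetic layer-wise data
        with torch.no_grad():
            target = layer(x)                       # pretrained layer's output
        out = F.conv2d(x, fake_quant(w, num_bits), layer.bias,
                       layer.stride, layer.padding, layer.dilation, layer.groups)
        loss = F.mse_loss(out, target)              # layer-wise reconstruction error
        opt.zero_grad(); loss.backward(); opt.step()
    layer.weight.data.copy_(fake_quant(w, num_bits).detach())
    return layer
```

For example, with a torchvision MobileNetV2, compress_layer(model.features[0][0], (3, 224, 224), num_bits=5) would fit a 5-bit version of the first convolution under these assumptions, without touching any other layer or any real data.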
Pages: 2019-2026
Page count: 8
Related papers (10 of 50 shown)
  • [1] Hwang, Jiwon; Uddin, A. F. M. Shahab; Bae, Sung-Ho. A Layer-Wise Extreme Network Compression for Super Resolution. IEEE ACCESS, 2021, 9: 93998-94009
  • [2] Lee, Eunho; Hwang, Youngbae. Layer-Wise Network Compression Using Gaussian Mixture Model. ELECTRONICS, 2021, 10(1): 1-16
  • [3] Lin, Ying-Jia; Chen, Kuan-Yu; Kao, Hung-Yu. LAD: Layer-Wise Adaptive Distillation for BERT Model Compression. SENSORS, 2023, 23(3)
  • [4] Hascoet, Tristan; Febvre, Quentin; Zhuang, Weihao; Ariki, Yasuo; Takiguchi, Tetsuya. Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019: 2049-2052
  • [5] Jung, Yeon-Jee; Han, Seung-Ho; Choi, Ho-Jin. Explaining CNN and RNN Using Selective Layer-Wise Relevance Propagation. IEEE ACCESS, 2021, 9: 18670-18681
  • [6] Chen, Xinwang; Ji, Fengrui; Chu, Renxin; Liu, Baolin. Data-free pruning of CNN using kernel similarity. MULTIMEDIA SYSTEMS, 2025, 31(2)
  • [7] Tang, Jialiang; Liu, Mingjin; Jiang, Ning; Cai, Huan; Yu, Wenxin; Zhou, Jinjia. Data-Free Network Pruning for Model Compression. 2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021
  • [8] Park, Chunmyung; Hyun, Eunjae; Kim, Jicheon; Nguyen, Xuan Truong; Lee, Hyuk-Jae. Live Demonstration: Layer-wise Configurable CNN Accelerator with High PE Utilization. 2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2023
  • [9] Hussain, Hanan; Tamizharasan, P. S.; Yadav, Praveen Kumar. LCRM: Layer-Wise Complexity Reduction Method for CNN Model Optimization on End Devices. IEEE ACCESS, 2023, 11: 66838-66857
  • [10] Chen, Tian; Tan, Yu-an; Zhang, Zheng; Luo, Nan; Li, Bin; Li, Yuanzhang. Dataflow optimization with layer-wise design variables estimation method for enflame CNN accelerators. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, 189