Layer-Wise Data-Free CNN Compression

Cited by: 1
Authors
Horton, Maxwell [1 ]
Jin, Yanzi [1 ]
Farhadi, Ali [1 ]
Rastegari, Mohammad [1 ]
Affiliation
[1] Apple, Cupertino, CA 95014 USA
DOI
10.1109/ICPR56361.2022.9956237
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a computationally efficient method for compressing a trained neural network without using real data. We break the problem of data-free network compression into independent layer-wise compressions. We show how to efficiently generate layer-wise training data using only a pretrained network. We use this data to perform independent layer-wise compressions on the pretrained network. We also show how to precondition the network to improve the accuracy of our layer-wise compression method. We present results for layer-wise compression using quantization and pruning. When quantizing, we compress with higher accuracy than related works while using orders of magnitude less compute. When compressing MobileNetV2 and evaluating on ImageNet, our method outperforms existing methods for quantization at all bit-widths, achieving a +0.34% improvement in 8-bit quantization, and a stronger improvement at lower bit-widths (up to a +28.50% improvement at 5 bits). When pruning, we outperform baselines of a similar compute envelope, achieving 1.5 times the sparsity rate at the same accuracy. We also show how to combine our efficient method with high-compute generative methods to improve upon their results.
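The abstract describes compressing each layer independently so that it reproduces the pretrained layer's outputs on data generated from the pretrained network alone. The sketch below illustrates that general idea for one convolution; the fake_quant and compress_layer helpers, the synthetic Gaussian inputs, and the straight-through fake-quantizer are assumptions for illustration only, not the paper's actual data-generation, preconditioning, or quantization procedure.

```python
# Minimal sketch of layer-wise, data-free quantization calibration for one Conv2d.
# Assumptions (not from the paper): synthetic Gaussian inputs stand in for the
# paper's generated layer-wise data; a uniform symmetric fake-quantizer with a
# straight-through gradient stands in for its quantization scheme.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Uniform symmetric fake-quantization with a straight-through gradient."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.detach().abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
    return w + (q - w).detach()  # gradient flows to w as identity

def compress_layer(layer: nn.Conv2d, input_shape, num_bits=8, steps=200, lr=1e-3):
    """Fit quantized weights so this layer reproduces the pretrained layer's
    outputs on synthetic inputs; each layer is handled independently."""
    w = layer.weight.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x = torch.randn(8, *input_shape)            # synthetic layer-wise data
        with torch.no_grad():
            target = layer(x)                       # pretrained layer's output
        out = F.conv2d(x, fake_quant(w, num_bits), layer.bias,
                       layer.stride, layer.padding, layer.dilation, layer.groups)
        loss = F.mse_loss(out, target)              # layer-wise reconstruction error
        opt.zero_grad(); loss.backward(); opt.step()
    layer.weight.data.copy_(fake_quant(w, num_bits).detach())
    return layer
```

For example, with a torchvision MobileNetV2, compress_layer(model.features[0][0], (3, 224, 224), num_bits=5) would fit a 5-bit version of the first convolution under these assumptions, without touching any other layer or any real data.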
Pages: 2019-2026
Page count: 8
Related papers (10 of 50 shown)
  • [1] Hwang, Jiwon; Uddin, A. F. M. Shahab; Bae, Sung-Ho. A Layer-Wise Extreme Network Compression for Super Resolution. IEEE ACCESS, 2021, 9: 93998-94009
  • [2] Lee, Eunho; Hwang, Youngbae. Layer-Wise Network Compression Using Gaussian Mixture Model. ELECTRONICS, 2021, 10(1): 1-16
  • [3] Lin, Ying-Jia; Chen, Kuan-Yu; Kao, Hung-Yu. LAD: Layer-Wise Adaptive Distillation for BERT Model Compression. SENSORS, 2023, 23(3)
  • [4] Hascoet, Tristan; Febvre, Quentin; Zhuang, Weihao; Ariki, Yasuo; Takiguchi, Tetsuya. Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019: 2049-2052
  • [5] Jung, Yeon-Jee; Han, Seung-Ho; Choi, Ho-Jin. Explaining CNN and RNN Using Selective Layer-Wise Relevance Propagation. IEEE ACCESS, 2021, 9: 18670-18681
  • [6] Chen, Xinwang; Ji, Fengrui; Chu, Renxin; Liu, Baolin. Data-free pruning of CNN using kernel similarity. MULTIMEDIA SYSTEMS, 2025, 31(2)
  • [7] Tang, Jialiang; Liu, Mingjin; Jiang, Ning; Cai, Huan; Yu, Wenxin; Zhou, Jinjia. Data-Free Network Pruning for Model Compression. 2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021
  • [8] Park, Chunmyung; Hyun, Eunjae; Kim, Jicheon; Nguyen, Xuan Truong; Lee, Hyuk-Jae. Live Demonstration: Layer-wise Configurable CNN Accelerator with High PE Utilization. 2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2023
  • [9] Hussain, Hanan; Tamizharasan, P. S.; Yadav, Praveen Kumar. LCRM: Layer-Wise Complexity Reduction Method for CNN Model Optimization on End Devices. IEEE ACCESS, 2023, 11: 66838-66857
  • [10] Chen, Tian; Tan, Yu-an; Zhang, Zheng; Luo, Nan; Li, Bin; Li, Yuanzhang. Dataflow optimization with layer-wise design variables estimation method for enflame CNN accelerators. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, 189