TRP: Trained Rank Pruning for Efficient Deep Neural Networks

Cited by: 0
Authors
Xu, Yuhui [1 ]
Li, Yuxi [1 ]
Zhang, Shuai [2 ]
Wen, Wei [3 ]
Wang, Botao [2 ]
Qi, Yingyong [2 ]
Chen, Yiran [3 ]
Lin, Weiyao [1 ]
Xiong, Hongkai [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Qualcomm AI Res, San Diego, CA USA
[3] Duke Univ, Durham, NC 27706 USA
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
CLC classification
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
To enable DNNs on edge devices like mobile phones, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to approximate a pre-trained model directly by low-rank decomposition; however, small approximation errors in the parameters can ripple into a large prediction loss. As a result, performance usually drops significantly, and substantial fine-tuning is required to recover accuracy. Clearly, it is not optimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear-norm regularizer optimized by stochastic sub-gradient descent is used to further promote low rank in TRP. A TRP-trained network inherently has a low-rank structure and can be approximated with negligible performance loss, eliminating the need for fine-tuning after low-rank decomposition. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression methods based on low-rank approximation.
Pages: 977 - 983
Page count: 7
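Below is a minimal PyTorch-style sketch of the alternating scheme described in the abstract: a plain SGD step on the task loss plus a nuclear-norm sub-gradient, with a periodic truncated-SVD substitution of each weight matrix. The function names (low_rank_approx, nuclear_norm_subgradient, trp_step), the energy threshold, learning rate, penalty weight, and approximation period are illustrative assumptions, not taken from the paper; for brevity the sketch handles only 2-D weight matrices and omits the reshaping needed for convolutional kernels.

import torch

def low_rank_approx(weight: torch.Tensor, energy: float = 0.98) -> torch.Tensor:
    """Truncated-SVD reconstruction keeping the smallest rank whose
    singular values retain `energy` of the total spectral mass (threshold is an assumption)."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    cumulative = torch.cumsum(S, dim=0) / S.sum()
    rank = int((cumulative < energy).sum().item()) + 1
    return (U[:, :rank] * S[:rank]) @ Vh[:rank, :]

def nuclear_norm_subgradient(weight: torch.Tensor) -> torch.Tensor:
    """A sub-gradient of the nuclear norm ||W||_* is U V^T for W = U S V^T."""
    U, _, Vh = torch.linalg.svd(weight, full_matrices=False)
    return U @ Vh

def trp_step(model: torch.nn.Module, loss: torch.Tensor,
             lr: float = 0.01, lam: float = 1e-4,
             step: int = 0, approx_every: int = 20) -> None:
    """One alternating step: SGD on the task loss plus the nuclear-norm
    sub-gradient, then a periodic low-rank substitution of the weights."""
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() != 2 or p.grad is None:  # sketch: fully connected weights only
                continue
            p -= lr * (p.grad + lam * nuclear_norm_subgradient(p))
            if step % approx_every == 0:
                p.copy_(low_rank_approx(p))

In this sketch the low-rank substitution happens every approx_every iterations while the nuclear-norm penalty is applied at every step, matching the abstract's description of alternating approximation with regularized training; rank selection and scheduling details would follow the paper itself.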