TRP: Trained Rank Pruning for Efficient Deep Neural Networks

Cited by: 0
Authors
Xu, Yuhui [1 ]
Li, Yuxi [1 ]
Zhang, Shuai [2 ]
Wen, Wei [3 ]
Wang, Botao [2 ]
Qi, Yingyong [2 ]
Chen, Yiran [3 ]
Lin, Weiyao [1 ]
Xiong, Hongkai [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Qualcomm AI Res, San Diego, CA USA
[3] Duke Univ, Durham, NC 27706 USA
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
To enable DNNs on edge devices such as mobile phones, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in the parameters can accumulate into a large prediction loss. As a result, performance usually drops significantly, and a sophisticated fine-tuning effort is required to recover accuracy. It is therefore suboptimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear-norm regularization term, optimized by stochastic sub-gradient descent, further promotes low rank in TRP. A TRP-trained network inherently has a low-rank structure and can be approximated with negligible performance loss, eliminating the fine-tuning step after low-rank decomposition. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression methods based on low-rank approximation.
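The alternation the abstract describes can be illustrated with a minimal PyTorch sketch. The function names (low_rank_approx, nuclear_subgradient, trp_step) and the hyperparameters (energy threshold, regularization weight reg, approximation period) are assumptions for illustration, not values or APIs from the paper; only plain 2-D weight matrices are handled, whereas the paper additionally reshapes convolutional kernels before decomposition.

    # Illustrative TRP-style training step (sketch; hyperparameters assumed).
    import torch

    def low_rank_approx(w: torch.Tensor, energy: float = 0.98) -> torch.Tensor:
        # Truncated SVD keeping the smallest rank that retains the given
        # fraction of the singular-value energy (assumed selection rule).
        u, s, vh = torch.linalg.svd(w, full_matrices=False)
        cum = torch.cumsum(s, dim=0) / s.sum()
        rank = int(torch.searchsorted(cum, torch.tensor(energy, device=s.device))) + 1
        return (u[:, :rank] * s[:rank]) @ vh[:rank, :]

    def nuclear_subgradient(w: torch.Tensor) -> torch.Tensor:
        # A sub-gradient of the nuclear norm ||W||_* (sum of singular
        # values) is U @ V^T, where W = U diag(S) V^T.
        u, _, vh = torch.linalg.svd(w, full_matrices=False)
        return u @ vh

    def trp_step(model, loss, optimizer, step, period=20, reg=1e-4):
        # One alternating update: back-propagate the task loss, add the
        # nuclear-norm sub-gradient, step the optimizer, and periodically
        # re-project 2-D weights onto a low-rank approximation in place.
        optimizer.zero_grad()
        loss.backward()
        with torch.no_grad():
            for p in model.parameters():
                if p.dim() == 2:
                    p.grad.add_(reg * nuclear_subgradient(p))
        optimizer.step()
        if step % period == 0:
            with torch.no_grad():
                for p in model.parameters():
                    if p.dim() == 2:
                        p.copy_(low_rank_approx(p))

In a training loop, trp_step(model, criterion(model(x), y), opt, step) would replace the usual backward/step pair; after convergence, each low-rank weight can be factorized once into two smaller layers, which is what removes the post-decomposition fine-tuning stage.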
Pages: 977-983 (7 pages)
Related Papers (50 in total)
  • [31] Task dependent deep LDA pruning of neural networks
    Tian, Qing
    Arbel, Tal
    Clark, James J.
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 203
  • [32] CUP: Cluster Pruning for Compressing Deep Neural Networks
    Duggal, Rahul
    Xiao, Cao
    Vuduc, Richard
Chau, Duen Horng
    Sun, Jimeng
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 5102 - 5106
  • [33] Pruning Deep Neural Networks by Optimal Brain Damage
    Liu, Chao
    Zhang, Zhiyong
    Wang, Dong
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1092 - 1095
  • [34] Structured Pruning for Deep Convolutional Neural Networks: A Survey
    He, Yang
    Xiao, Lingao
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 2900 - 2919
  • [35] Class-dependent Pruning of Deep Neural Networks
    Entezari, Rahim
    Saukh, Olga
    2020 IEEE SECOND WORKSHOP ON MACHINE LEARNING ON EDGE IN SENSOR SYSTEMS (SENSYS-ML 2020), 2020, : 13 - 18
  • [36] On the Information of Feature Maps and Pruning of Deep Neural Networks
    Soltani, Mohammadreza
    Wu, Suya
    Ding, Jie
    Ravier, Robert
    Tarokh, Vahid
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 6988 - 6995
  • [37] Conditional Automated Channel Pruning for Deep Neural Networks
    Liu, Yixin
    Guo, Yong
    Guo, Jiaxin
    Jiang, Luoqian
    Chen, Jian
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 1275 - 1279
  • [38] Self-distilled Pruning of Deep Neural Networks
O'Neill, James
    Dutta, Sourav
    Assem, Haytham
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT II, 2023, 13714 : 655 - 670
  • [39] Channel Pruning for Accelerating Very Deep Neural Networks
    He, Yihui
    Zhang, Xiangyu
    Sun, Jian
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 1398 - 1406
  • [40] Structured Pruning of RRAM Crossbars for Efficient In-Memory Computing Acceleration of Deep Neural Networks
    Meng, Jian
    Yang, Li
    Peng, Xiaochen
    Yu, Shimeng
    Fan, Deliang
    Seo, Jae-Sun
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2021, 68 (05) : 1576 - 1580