TRP: Trained Rank Pruning for Efficient Deep Neural Networks

Cited by: 0
Authors
Xu, Yuhui [1 ]
Li, Yuxi [1 ]
Zhang, Shuai [2 ]
Wen, Wei [3 ]
Wang, Botao [2 ]
Qi, Yingyong [2 ]
Chen, Yiran [3 ]
Lin, Weiyao [1 ]
Xiong, Hongkai [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Qualcomm AI Res, San Diego, CA USA
[3] Duke Univ, Durham, NC 27706 USA
Funding
National Natural Science Foundation of China
Keywords
DOI
N/A
CLC Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
To enable DNNs on edge devices such as mobile phones, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in the parameters can ripple into a large prediction loss. As a result, performance usually drops significantly, and substantial fine-tuning is required to recover accuracy. Clearly, it is suboptimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear-norm regularizer, optimized by stochastic sub-gradient descent, further promotes low rank in TRP. A TRP-trained network inherently has a low-rank structure and can be approximated with negligible performance loss, eliminating the fine-tuning step after low-rank decomposition. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression methods based on low-rank approximation.
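The alternating scheme the abstract describes can be illustrated with a small sketch. This is a hypothetical toy example, not the authors' implementation: the matrix sizes, learning rate, regularization weight `lam`, target rank, truncation period, and helper names (`nuclear_subgrad`, `truncate_rank`) are all assumptions; in the paper the same idea is applied per layer to the network's weight tensors.

```python
# Toy sketch of the TRP idea (illustrative, not the authors' code):
# alternate SGD steps with (a) a nuclear-norm sub-gradient step that
# promotes low rank and (b) a periodic hard SVD truncation.
import torch

def nuclear_subgrad(W):
    # A sub-gradient of the nuclear norm ||W||_* is U @ Vh from the SVD of W.
    U, _, Vh = torch.linalg.svd(W, full_matrices=False)
    return U @ Vh

def truncate_rank(W, rank):
    # Best rank-`rank` approximation of W via truncated SVD.
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    S = S.clone()
    S[rank:] = 0.0
    return U @ torch.diag(S) @ Vh

torch.manual_seed(0)
X = torch.randn(64, 16)                          # synthetic inputs
W_true = truncate_rank(torch.randn(16, 16), 2)   # rank-2 ground truth
Y = X @ W_true                                   # synthetic targets
W = torch.randn(16, 16, requires_grad=True)      # weights to train

lr, lam, rank = 0.5, 1e-3, 4                     # assumed hyperparameters
init_loss = ((X @ W - Y) ** 2).mean().item()
for step in range(200):
    loss = ((X @ W - Y) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        # data gradient plus nuclear-norm sub-gradient
        W -= lr * (W.grad + lam * nuclear_subgrad(W))
        W.grad.zero_()
        if step % 10 == 0:
            # the "trained rank pruning" step: project onto low rank
            W.copy_(truncate_rank(W, rank))

# Final decomposition: because training kept W near low rank,
# truncation barely moves the loss and no fine-tuning is needed.
W_final = truncate_rank(W.detach(), rank)
final_loss = ((X @ W_final - Y) ** 2).mean().item()
```

Under these assumptions the trained `W` stays close to a low-rank matrix, so the final truncated SVD (stored as two smaller factor matrices at inference time) replaces `W` with a negligible increase in loss.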
Pages: 977 - 983 (7 pages)
Related Papers
50 records in total
  • [21] HeadStart: Enforcing Optimal Inceptions in Pruning Deep Neural Networks for Efficient Inference on GPGPUs
    Lin, Ning
    Lu, Hang
    Wei, Xin
    Li, Xiaowei
    PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2019
  • [22] Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep Neural Networks
    Xu, Kaixin
    Wang, Zhe
    Geng, Xue
    Wu, Min
    Li, Xiaoli
    Lin, Weisi
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 17401 - 17411
  • [23] EasiEdge: A Novel Global Deep Neural Networks Pruning Method for Efficient Edge Computing
    Yu, Fang
    Cui, Li
    Wang, Pengcheng
    Han, Chuanqi
    Huang, Ruoran
    Huang, Xi
    IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (03): 1259 - 1271
  • [24] Pruning deep convolutional neural networks for efficient edge computing in condition assessment of infrastructures
    Wu, Rih-Teng
    Singla, Ankush
    Jahanshahi, Mohammad R.
    Bertino, Elisa
    Ko, Bong Jun
    Verma, Dinesh
    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2019, 34 (09) : 774 - 789
  • [25] A Novel Clustering-Based Filter Pruning Method for Efficient Deep Neural Networks
    Wei, Xiaohui
    Shen, Xiaoxian
    Zhou, Changbao
    Yue, Hengshan
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT II, 2020, 12453 : 245 - 258
  • [26] DEEP LEARNING BASED METHOD FOR PRUNING DEEP NEURAL NETWORKS
    Li, Lianqiang
    Zhu, Jie
    Sun, Ming-Ting
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2019: 312 - 317
  • [27] Deep physical neural networks trained with backpropagation
    Wright, Logan G.
    Onodera, Tatsuhiro
    Stein, Martin M.
    Wang, Tianyu
    Schachter, Darren T.
    Hu, Zoey
    McMahon, Peter L.
    NATURE, 2022, 601 (7894): 549 - 555
  • [29] Anonymous Model Pruning for Compressing Deep Neural Networks
    Zhang, Lechun
    Chen, Guangyao
    Shi, Yemin
    Zhang, Quan
    Tan, Mingkui
    Wang, Yaowei
    Tian, Yonghong
    Huang, Tiejun
    THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2020), 2020: 161 - 164
  • [30] A New Pruning Method to Train Deep Neural Networks
    Guo, Haonan
    Ren, Xudie
    Li, Shenghong
    COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, 2018, 423 : 767 - 775