Efficient Low-Dimensional Compression of Overparameterized Models

Authors
Kwon, Soo Min [1]
Zhang, Zekai [2]
Song, Dogyoon [1]
Balzano, Laura [1]
Qu, Qing [1]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Tsinghua Univ, Beijing, Peoples R China
Keywords
RANK;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this work, we present a novel approach for compressing overparameterized models, developed through studying their learning dynamics. We observe that for many deep models, updates to the weight matrices occur within a low-dimensional invariant subspace. For deep linear models, we demonstrate that their principal components are fitted incrementally within a small subspace, and we use these insights to propose a compression algorithm for deep linear networks that involves decreasing the width of their intermediate layers. We empirically evaluate the effectiveness of our compression technique on matrix recovery problems. Remarkably, by using an initialization that exploits the structure of the problem, we observe that our compressed network converges faster than the original network, consistently yielding smaller recovery errors. We substantiate this observation by developing a theory focused on deep matrix factorization. Finally, we empirically demonstrate how our compressed model has the potential to improve the utility of deep nonlinear models. Overall, our algorithm improves training efficiency by more than 2x without compromising generalization.
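As a rough illustration of the idea described in the abstract, and not the authors' algorithm, the sketch below fits a three-layer deep linear factorization to a low-rank matrix using intermediate layers of width k far smaller than the ambient dimension d. The spectral initialization, the function names `make_low_rank` and `train_compressed`, and every hyperparameter (k, eps, lr, steps) are assumptions chosen for this example only.

```python
import numpy as np


def make_low_rank(d, singular_values, rng):
    """Build a d x d matrix whose nonzero singular values are the given ones."""
    r = len(singular_values)
    U, _ = np.linalg.qr(rng.standard_normal((d, r)))
    V, _ = np.linalg.qr(rng.standard_normal((d, r)))
    return U @ np.diag(singular_values) @ V.T


def train_compressed(M, k, eps=0.1, lr=0.05, steps=2000):
    """Fit W3 @ W2 @ W1 to M by gradient descent, with inner width k.

    Assumption for this sketch: the outer factors start in the top-k
    singular subspaces of M, scaled by a small constant eps.
    """
    U, _, Vt = np.linalg.svd(M)
    W3 = eps * U[:, :k]      # d x k
    W2 = eps * np.eye(k)     # k x k
    W1 = eps * Vt[:k, :]     # k x d
    for _ in range(steps):
        E = W3 @ W2 @ W1 - M          # residual of 0.5 * ||W3 W2 W1 - M||_F^2
        G3 = E @ (W2 @ W1).T          # gradient with respect to W3
        G2 = W3.T @ E @ W1.T          # gradient with respect to W2
        G1 = (W3 @ W2).T @ E          # gradient with respect to W1
        W3 = W3 - lr * G3
        W2 = W2 - lr * G2
        W1 = W1 - lr * G1
    return W3, W2, W1


rng = np.random.default_rng(0)
d, k = 20, 4                                   # k = 2 * rank, an arbitrary choice
M = make_low_rank(d, [3.0, 2.0], rng)
W3, W2, W1 = train_compressed(M, k)

rel_err = np.linalg.norm(W3 @ W2 @ W1 - M) / np.linalg.norm(M)
n_compressed = W3.size + W2.size + W1.size     # 2*d*k + k*k parameters
n_full_width = 3 * d * d                       # three d x d layers
print(f"relative error: {rel_err:.2e}")
print(f"parameters: {n_compressed} (compressed) vs {n_full_width} (full width)")
```

In this toy setting the narrow factorization has 2dk + k² parameters instead of 3d², which is where any saving in training cost would come from. Whether it also converges faster than a full-width network, as the abstract reports for the authors' method, is not something this sketch tests.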
Pages: 26