Efficient Low-Dimensional Compression of Overparameterized Models

Cited by: 0
Authors
Kwon, Soo Min [1 ]
Zhang, Zekai [2 ]
Song, Dogyoon [1 ]
Balzano, Laura [1 ]
Qu, Qing [1 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Tsinghua Univ, Beijing, Peoples R China
Keywords
RANK;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we present a novel approach for compressing overparameterized models, developed through studying their learning dynamics. We observe that for many deep models, updates to the weight matrices occur within a low-dimensional invariant subspace. For deep linear models, we demonstrate that their principal components are fitted incrementally within a small subspace, and use these insights to propose a compression algorithm for deep linear networks that involves decreasing the width of their intermediate layers. We empirically evaluate the effectiveness of our compression technique on matrix recovery problems. Remarkably, by using an initialization that exploits the structure of the problem, we observe that our compressed network converges faster than the original network, consistently yielding smaller recovery errors. We substantiate this observation by developing a theory focused on deep matrix factorization. Finally, we empirically demonstrate how our compressed model has the potential to improve the utility of deep nonlinear models. Overall, our algorithm improves the training efficiency by more than 2x, without compromising generalization.
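The idea of narrowing the intermediate layers of a deep linear network can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: the three-factor form `U @ S @ V.T`, the SVD-based initialization, the function name `compressed_deep_factorization`, and all hyperparameters (`k`, `steps`, `lr`, `eps`) are assumptions chosen for illustration. It only shows that a rank-`r` matrix can be fitted by a three-layer linear model whose inner width `k` is much smaller than the matrix dimension `d`.

```python
import numpy as np

def compressed_deep_factorization(M, k, steps=500, lr=0.1, eps=1e-2):
    """Fit M ~ U @ S @ V.T by gradient descent, with narrow inner width k.

    Hypothetical illustration: the loss is 0.5 * ||U S V^T - M||_F^2.
    """
    # Assumed initialization: outer factors span the top-k singular
    # subspaces of M; the inner factor starts as a small multiple of I.
    U0, _, V0t = np.linalg.svd(M, full_matrices=False)
    U = U0[:, :k].copy()
    V = V0t[:k, :].T.copy()
    S = eps * np.eye(k)

    for _ in range(steps):
        R = U @ S @ V.T - M      # residual of the current fit
        grad_U = R @ V @ S.T     # dLoss/dU
        grad_S = U.T @ R @ V     # dLoss/dS
        grad_V = R.T @ U @ S     # dLoss/dV
        U -= lr * grad_U
        S -= lr * grad_S
        V -= lr * grad_V
    return U, S, V

# Synthetic rank-3 target with singular values 1.0, 0.6, 0.3 (d = 20).
rng = np.random.default_rng(0)
d, r, k = 20, 3, 6
Q1, _ = np.linalg.qr(rng.standard_normal((d, d)))
Q2, _ = np.linalg.qr(rng.standard_normal((d, d)))
M = Q1[:, :r] @ np.diag([1.0, 0.6, 0.3]) @ Q2[:, :r].T

U, S, V = compressed_deep_factorization(M, k)
rel_err = np.linalg.norm(U @ S @ V.T - M) / np.linalg.norm(M)

# Parameter counts: three full d x d layers versus the narrow factors.
full_params = 3 * d * d
compressed_params = 2 * d * k + k * k
print(rel_err, full_params, compressed_params)
```

In this toy setting the narrow model stores `2*d*k + k*k` parameters instead of `3*d*d`. The example does not compare convergence speed against a full-width network and says nothing about the more than 2x efficiency gain reported in the abstract.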
Pages: 26