Dynamic Low-rank Estimation for Transformer-based Language Models

Cited by: 0
Authors
Huai, Ting [1 ]
Lie, Xiao [2 ]
Gao, Shangqian [1 ]
Hsu, Yenchang [2 ]
Shen, Yilin [2 ]
Jin, Hongxia [1 ]
Affiliations
[1] Samsung Res Amer, Mountain View, CA 94043 USA
[2] Univ Michigan, Ann Arbor, MI 48109 USA
Keywords
DOI
None
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Matrix decomposition methods, such as Singular Value Decomposition (SVD) and its importance-weighted variants, have been widely used for compressing Transformer-based language models. While importance-weighted decomposition methods alleviate SVD's strong assumption that every parameter is equally important, they still rely on two fundamental assumptions: 1) the importance distribution remains unchanged during further fine-tuning, and 2) importance is equal across weight matrices in different layers. Furthermore, these methods require a well-trained task-specific model as the starting point and additional fine-tuning after compression. In this work, we propose RankDyna, a matrix decomposition method that enables dynamic rank resource allocation among matrices across different layers during the training process. Starting from a general pre-trained model, RankDyna accomplishes the dual goals of compression and adaptation to the downstream task within a single round of fine-tuning. Extensive evaluations demonstrate that RankDyna outperforms current SOTA methods under various parameter budget levels, and its advantage grows with higher compression rates.
Pages: 9275-9287 (13 pages)