Dynamic Low-rank Estimation for Transformer-based Language Models

Cited by: 0
Authors
Huai, Ting [1 ]
Lie, Xiao [2 ]
Gao, Shangqian [1 ]
Hsu, Yenchang [2 ]
Shen, Yilin [2 ]
Jin, Hongxia [1 ]
Affiliations
[1] Samsung Res Amer, Mountain View, CA 94043 USA
[2] Univ Michigan, Ann Arbor, MI 48109 USA
Keywords
DOI
None
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Matrix decomposition methods, such as Singular Value Decomposition (SVD) and its importance-weighted variants, have been widely used for compressing Transformer-based language models. While importance-weighted decomposition methods alleviate SVD's strong assumption that every parameter is equally important, they still rely on two fundamental assumptions: 1) the importance distribution remains unchanged during further fine-tuning, and 2) importance is equal across weight matrices in different layers. Furthermore, these methods require a well-trained task-specific model as the starting point and additional fine-tuning after compression. In this work, we propose RankDyna, a matrix decomposition method that enables dynamic rank resource allocation among matrices across different layers during the training process. Starting from a general pre-trained model, RankDyna accomplishes the dual goals of compression and adaptation to the downstream task within a single round of fine-tuning. Extensive evaluations demonstrate that RankDyna outperforms current SOTA methods under various parameter budget levels, and its advantage grows with higher compression rates.
Pages: 9275-9287 (13 pages)