MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter

Cited: 0
Authors
Hao, Jitai [1 ]
Sun, Weiwei [1 ,2 ]
Xin, Xin [1 ]
Meng, Qi [3 ]
Chen, Zhumin [1 ]
Ren, Pengjie [1 ]
Ren, Zhaochun [4 ]
Affiliations
[1] Shandong Univ, Qingdao, Peoples R China
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Acad Math & Syst Sci, Beijing, Peoples R China
[4] Leiden Univ, Leiden, Netherlands
Source
PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS | 2024
Funding
National Key R&D Program of China;
Keywords
(none listed)
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Parameter-Efficient Fine-tuning (PEFT) facilitates the fine-tuning of Large Language Models (LLMs) under limited resources. However, PEFT performance on complex, knowledge-intensive tasks is limited by the constrained model capacity that comes from the small number of additional trainable parameters. To overcome this limitation, we introduce a novel mechanism that fine-tunes LLMs with larger adapters while remaining memory-efficient. This is achieved by exploiting the inherent activation sparsity in the Feed-Forward Networks (FFNs) of LLMs and the larger capacity of Central Processing Unit (CPU) memory relative to Graphics Processing Unit (GPU) memory: we store and update the parameters of the larger adapters on the CPU. Moreover, we employ a Mixture of Experts (MoE)-like architecture to avoid unnecessary CPU computation and to reduce the communication volume between the GPU and CPU, which is particularly beneficial given the limited bandwidth of PCI Express (PCIe). Our method achieves fine-tuning results comparable to those obtained with larger memory capacities even under tighter resource constraints, such as a single GPU with 24 GB of memory, with an acceptable loss in training efficiency. Our code is available at https://github.com/CURRENTF/MEFT.
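Read as a recipe, the abstract combines three ideas: a wide FFN-style adapter whose weights live in CPU RAM rather than GPU memory; activation sparsity, so only a few adapter neurons matter per token; and an MoE-like router on the GPU that predicts those neurons, so only the selected rows cross the PCIe bus. Below is a minimal PyTorch sketch of how these pieces might fit together. The class name SparseCPUAdapter, the sigmoid gating, and all shapes and hyperparameters are illustrative assumptions, not the authors' actual implementation (that lives in the linked repository).

import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseCPUAdapter(nn.Module):
    """Illustrative MEFT-style module (names and details are assumptions).

    The wide adapter weights live in CPU RAM; a small GPU-resident router
    predicts which adapter neurons a batch will activate, so only those
    rows are copied across the PCIe bus each step."""

    def __init__(self, d_model: int, d_adapter: int, k: int):
        super().__init__()
        self.k = k
        # Large FFN-style "key"/"value" matrices, kept on the CPU.
        self.key = nn.Parameter(0.02 * torch.randn(d_adapter, d_model, device="cpu"))
        self.value = nn.Parameter(0.02 * torch.randn(d_adapter, d_model, device="cpu"))
        # Lightweight MoE-like router; move this (and only this) to the GPU.
        self.router = nn.Linear(d_model, d_adapter)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (N, d_model) on the accelerator.
        scores = self.router(hidden)                    # (N, d_adapter)
        topk = scores.topk(self.k, dim=-1).indices      # per-token top-k neurons
        idx = topk.unique()                             # batch union of selected neurons
        idx_cpu = idx.cpu()                             # CPU copy for indexing CPU weights
        # Only the selected rows cross PCIe (pinned memory + non_blocking=True
        # would let a real implementation overlap this copy with compute).
        k_rows = self.key[idx_cpu].to(hidden.device)    # (|idx|, d_model)
        v_rows = self.value[idx_cpu].to(hidden.device)  # (|idx|, d_model)
        # Sigmoid gating keeps the router trainable despite the hard top-k;
        # as a simplification, every token sees the batch union of neurons.
        gate = torch.sigmoid(scores.index_select(1, idx))  # (N, |idx|)
        act = F.relu(hidden @ k_rows.T) * gate              # sparse FFN-style activation
        return act @ v_rows                                 # (N, d_model)

A hypothetical usage, keeping only the router on the accelerator:

device = "cuda" if torch.cuda.is_available() else "cpu"
adapter = SparseCPUAdapter(d_model=512, d_adapter=8192, k=64)
adapter.router.to(device)                # only the router occupies GPU memory
x = torch.randn(16, 512, device=device)
out = adapter(x)                         # (16, 512); adapter weights stayed on CPU

In this sketch, gradients flow back through the indexing and .to() copies to the CPU-resident parameters, which a CPU-side optimizer would then update; the paper's actual routing policy and its overlap of PCIe transfers with computation are described in the paper and repository.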
Pages: 2375 - 2388
Page count: 14
Related Papers
50 items in total
  • [41] Fine-Tuning the Fuzziness of Strong Fuzzy Partitions through PSO
    Castiello, Ciro
    Mencar, Corrado
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2020, 13 (01) : 1415 - 1428
  • [42] Fine-Tuning Development Through Antagonistic Peptides: An Emerging Theme
    Lee, Jin Suk
    De Smet, Ive
    TRENDS IN PLANT SCIENCE, 2016, 21 (12) : 991 - 993
  • [43] Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-tuning
    Zhang, Zhen-Ru
    Tan, Chuanqi
    Xu, Haiyang
    Wang, Chengyu
    Huang, Jun
    Huang, Songfang
61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023: 1239 - 1248
  • [44] Polar opposites: Fine-tuning cytokinesis through SIN asymmetry
    Johnson, Alyssa E.
    McCollum, Dannel
    Gould, Kathleen L.
    CYTOSKELETON, 2012, 69 (10) : 686 - 699
  • [45] Color Fine-Tuning of Optical Materials Through Rational Design
    Holzer, Brigitte
    Bintinger, Johannes
    Lumpi, Daniel
    Choi, Christopher
    Kim, Youngwan
    Stoeger, Berthold
    Hametner, Christian
    Marchetti-Deschmann, Martina
    Plasser, Felix
    Horkel, Ernst
    Kymissis, Ioannis
    Froehlich, Johannes
    CHEMPHYSCHEM, 2017, 18 (05) : 549 - 563
  • [46] Parameters Efficient Fine-Tuning for Long-Tailed Sequential Recommendation
    Lv, Zheqi
    Wang, Feng
    Zhang, Shengyu
    Zhang, Wenqiao
    Kuang, Kun
    Wu, Fei
    ARTIFICIAL INTELLIGENCE, CICAI 2023, PT I, 2024, 14473 : 442 - 459
  • [47] Democratizing protein language models with parameter-efficient fine-tuning
    Sledzieski, Samuel
    Kshirsagar, Meghana
    Baek, Minkyung
    Dodhia, Rahul
    Ferres, Juan Lavista
    Berger, Bonnie
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2024, 121 (26)
  • [48] AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning
    Zhou, Han
    Wan, Xingchen
    Vulic, Ivan
    Korhonen, Anna
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12 : 525 - 542
  • [49] Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning
    He, Haoyu
    Cai, Jianfei
    Zhang, Jing
    Tao, Dacheng
    Zhuang, Bohan
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 11791 - 11801
  • [50] LLAMAFACTORY: Unified Efficient Fine-Tuning of 100+ Language Models
    Zheng, Yaowei
    Zhang, Richong
    Zhang, Junhao
    Ye, Yanhan
    Luo, Zheyan
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 3: SYSTEM DEMONSTRATIONS, 2024: 400 - 410