MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter

Cited by: 0
Authors
Hao, Jitai [1 ]
Sun, Weiwei [1 ,2 ]
Xin, Xin [1 ]
Meng, Qi [3 ]
Chen, Zhumin [1 ]
Ren, Pengjie [1 ]
Ren, Zhaochun [4 ]
Affiliations
[1] Shandong Univ, Qingdao, Peoples R China
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Acad Math & Syst Sci, Beijing, Peoples R China
[4] Leiden Univ, Leiden, Netherlands
Funding
National Key R&D Program of China;
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Parameter-Efficient Fine-Tuning (PEFT) facilitates the fine-tuning of Large Language Models (LLMs) under limited resources. However, fine-tuning performance with PEFT on complex, knowledge-intensive tasks is limited by the constrained model capacity that results from the small number of additional trainable parameters. To overcome this limitation, we introduce a novel mechanism that fine-tunes LLMs with larger adapters while remaining memory-efficient. This is achieved by exploiting the inherent activation sparsity in the Feed-Forward Networks (FFNs) of LLMs and the larger capacity of Central Processing Unit (CPU) memory relative to that of the Graphics Processing Unit (GPU): the parameters of the larger adapters are stored and updated on the CPU. Moreover, we employ a Mixture-of-Experts (MoE)-like architecture to avoid unnecessary CPU computation and to reduce the communication volume between the GPU and CPU, which is particularly beneficial given the limited bandwidth of PCI Express (PCIe). Our method achieves fine-tuning results comparable to those obtained with larger memory capacities even under more limited resources, such as a single GPU with 24 GB of memory, with an acceptable loss in training efficiency. Our code is available at https://github.com/CURRENTF/MEFT.
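The mechanism summarized in the abstract can be illustrated with a minimal PyTorch sketch. This is an assumption-laden illustration, not the authors' implementation (see the linked repository): the class name SparseCPUAdapter, the parameters n_neurons and top_k, and the sizes in the usage note are all hypothetical. The idea shown is that the large adapter weights stay in CPU memory, while an MoE-like router on the GPU selects only the adapter neurons needed for the current batch, so that just those rows cross the PCIe link.

import torch
import torch.nn as nn

class SparseCPUAdapter(nn.Module):
    """Hypothetical sketch: large adapter weights kept on CPU, MoE-like routing on GPU."""
    def __init__(self, d_model: int, n_neurons: int, top_k: int, device: str = "cuda"):
        super().__init__()
        self.top_k = top_k
        self.device = device
        # The (large) adapter weights live in CPU memory to spare GPU memory.
        self.key = nn.Parameter(torch.randn(n_neurons, d_model) * 0.02)
        self.value = nn.Parameter(torch.zeros(n_neurons, d_model))
        # A small router on the GPU scores which adapter neurons are worth fetching.
        self.router = nn.Linear(d_model, n_neurons, bias=False).to(device)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (n_tokens, d_model), already on the GPU.
        scores = self.router(hidden)                        # (n_tokens, n_neurons)
        top_ids = scores.topk(self.top_k, dim=-1).indices   # per-token neuron selection
        active = top_ids.unique().cpu()                     # union of selected neurons
        # Only the selected rows cross the PCIe link (CPU -> GPU).
        k = self.key[active].to(self.device, non_blocking=True)
        v = self.value[active].to(self.device, non_blocking=True)
        # Bottleneck-style adapter computation restricted to the active neurons.
        act = torch.relu(hidden @ k.T)                      # (n_tokens, |active|)
        return hidden + act @ v                             # residual update

# Usage with hypothetical sizes:
# adapter = SparseCPUAdapter(d_model=4096, n_neurons=65536, top_k=64)
# out = adapter(torch.randn(8, 4096, device="cuda"))

In this sketch the optimizer would also hold the adapter's states on the CPU; only the small set of active rows and their gradients are exchanged with the GPU each step, which is what keeps PCIe traffic low.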
Pages: 2375-2388
Page count: 14
Related papers
50 records in total
  • [31] Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
    Zhang, Zhengxin
    Zhao, Dan
    Miao, Xupeng
    Oliaro, Gabriele
    Zhang, Zhihao
    Li, Qing
    Jiang, Yong
    Jia, Zhihao
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 1 - 17
  • [32] Tree Prompting: Efficient Task Adaptation without Fine-Tuning
    Morris, John X.
    Singh, Chandan
    Rush, Alexander M.
    Gao, Jianfeng
    Deng, Yuntian
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 6253 - 6267
  • [33] Mechanically responsive crystals: tuning flexibility through fine-tuning intermolecular interactions
    Dakovic, M.
    Pisacic, M.
    Misura, O.
    ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES, 2022, 78 : E191 - E191
  • [34] DCFT: Dependency-aware continual learning fine-tuning for sparse LLMs
    Wang, Yanzhe
    Wang, Yizhen
    Yin, Baoqun
    NEUROCOMPUTING, 2025, 636
  • [35] HPCache: Memory-Efficient OLAP Through Proportional Caching
    Nicholson, Hamish
    Chrysogelos, Periklis
    Ailamaki, Anastasia
    18TH INTERNATIONAL WORKSHOP ON DATA MANAGEMENT ON NEW HARDWARE, DAMON 2022, 2022,
  • [36] Fine-tuning of proteasome inhibitors through rational pathway engineering
    Baunach, Martin
    CHEM, 2024, 10 (10):
  • [37] Fine-Tuning Stomatal Movement Through Small Signaling Peptides
    Qu, Xinyun
    Cao, Bing
    Kang, Jingke
    Wang, Xuening
    Han, Xiangyu
    Jiang, Wenqian
    Shi, Xiong
    Zhang, Luosha
    Cui, Langjun
    Hu, Zhubing
    Zhang, Yonghong
    Wang, Guodong
    FRONTIERS IN PLANT SCIENCE, 2019, 10
  • [38] Fine-Tuning the Energetic Properties of Complexes through Ligand Modification
    Zhang, Ji-Chuan
    Su, Hui
    Guo, Shu
    Dong, Ya-Lu
    Zhang, Shao-Wen
    Zou, Tao
    Li, Sheng-Hua
    Pang, Si-Ping
    CRYSTAL GROWTH & DESIGN, 2018, 18 (04) : 2217 - 2224
  • [39] Fine-Tuning the Fuzziness of Strong Fuzzy Partitions through PSO
    Castiello, Ciro
    Mencar, Corrado
    International Journal of Computational Intelligence Systems, 2020, 13 : 1415 - 1428
  • [40] Fine-tuning nicotinic receptor function through the lipid bilayer
    Baenziger, JE
    da Costa, CJB
    Goodreid, M
    FEBS JOURNAL, 2005, 272 : 234 - 234