MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter

Cited by: 0
Authors
Hao, Jitai [1 ]
Sun, Weiwei [1 ,2 ]
Xin, Xin [1 ]
Meng, Qi [3 ]
Chen, Zhumin [1 ]
Ren, Pengjie [1 ]
Ren, Zhaochun [4 ]
Affiliations
[1] Shandong Univ, Qingdao, Peoples R China
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Acad Math & Syst Sci, Beijing, Peoples R China
[4] Leiden Univ, Leiden, Netherlands
Funding
National Key R&D Program of China;
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Parameter-Efficient Fine-Tuning (PEFT) facilitates the fine-tuning of Large Language Models (LLMs) under limited resources. However, fine-tuning performance with PEFT on complex, knowledge-intensive tasks is limited by the constrained model capacity that results from the small number of additional trainable parameters. To overcome this limitation, we introduce a novel mechanism that fine-tunes LLMs with larger adapters while remaining memory-efficient. This is achieved by exploiting the inherent activation sparsity in the Feed-Forward Networks (FFNs) of LLMs and the larger capacity of Central Processing Unit (CPU) memory relative to that of the Graphics Processing Unit (GPU): the parameters of the larger adapters are stored and updated on the CPU. Moreover, we employ a Mixture-of-Experts (MoE)-like architecture to avoid unnecessary CPU computation and to reduce the communication volume between the GPU and CPU, which is particularly beneficial given the limited bandwidth of PCI Express (PCIe). Our method achieves fine-tuning results comparable to those obtained with larger memory capacities even under more limited resources, such as a single GPU with 24 GB of memory, with an acceptable loss in training efficiency. Our code is available at https://github.com/CURRENTF/MEFT.
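The mechanism summarized in the abstract can be illustrated with a minimal PyTorch sketch. This is an assumption-laden illustration, not the authors' implementation (see the linked repository): the class name SparseCPUAdapter, the parameters n_neurons and top_k, and the sizes in the usage note are all hypothetical. The idea shown is that the large adapter weights stay in CPU memory, while an MoE-like router on the GPU selects only the adapter neurons needed for the current batch, so that just those rows cross the PCIe link.

import torch
import torch.nn as nn

class SparseCPUAdapter(nn.Module):
    """Hypothetical sketch: large adapter weights kept on CPU, MoE-like routing on GPU."""
    def __init__(self, d_model: int, n_neurons: int, top_k: int, device: str = "cuda"):
        super().__init__()
        self.top_k = top_k
        self.device = device
        # The (large) adapter weights live in CPU memory to spare GPU memory.
        self.key = nn.Parameter(torch.randn(n_neurons, d_model) * 0.02)
        self.value = nn.Parameter(torch.zeros(n_neurons, d_model))
        # A small router on the GPU scores which adapter neurons are worth fetching.
        self.router = nn.Linear(d_model, n_neurons, bias=False).to(device)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (n_tokens, d_model), already on the GPU.
        scores = self.router(hidden)                        # (n_tokens, n_neurons)
        top_ids = scores.topk(self.top_k, dim=-1).indices   # per-token neuron selection
        active = top_ids.unique().cpu()                     # union of selected neurons
        # Only the selected rows cross the PCIe link (CPU -> GPU).
        k = self.key[active].to(self.device, non_blocking=True)
        v = self.value[active].to(self.device, non_blocking=True)
        # Bottleneck-style adapter computation restricted to the active neurons.
        act = torch.relu(hidden @ k.T)                      # (n_tokens, |active|)
        return hidden + act @ v                             # residual update

# Usage with hypothetical sizes:
# adapter = SparseCPUAdapter(d_model=4096, n_neurons=65536, top_k=64)
# out = adapter(torch.randn(8, 4096, device="cuda"))

In this sketch the optimizer would also hold the adapter's states on the CPU; only the small set of active rows and their gradients are exchanged with the GPU each step, which is what keeps PCIe traffic low.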
Pages: 2375-2388
Page count: 14
Related papers
50 records in total
  • [31] Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
    Zhang, Zhengxin
    Zhao, Dan
    Miao, Xupeng
    Oliaro, Gabriele
    Zhang, Zhihao
    Li, Qing
    Jiang, Yong
    Jia, Zhihao
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 1 - 17
  • [32] Tree Prompting: Efficient Task Adaptation without Fine-Tuning
    Morris, John X.
    Singh, Chandan
    Rush, Alexander M.
    Gao, Jianfeng
    Deng, Yuntian
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 6253 - 6267
  • [33] Mechanically responsive crystals: tuning flexibility through fine-tuning intermolecular interactions
    Dakovic, M.
    Pisacic, M.
    Misura, O.
    ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES, 2022, 78 : E191 - E191
  • [34] DCFT: Dependency-aware continual learning fine-tuning for sparse LLMs
    Wang, Yanzhe
    Wang, Yizhen
    Yin, Baoqun
    NEUROCOMPUTING, 2025, 636
  • [35] HPCache: Memory-Efficient OLAP Through Proportional Caching
    Nicholson, Hamish
    Chrysogelos, Periklis
    Ailamaki, Anastasia
    18TH INTERNATIONAL WORKSHOP ON DATA MANAGEMENT ON NEW HARDWARE, DAMON 2022, 2022,
  • [36] Fine-tuning of proteasome inhibitors through rational pathway engineering
    Baunach, Martin
    CHEM, 2024, 10 (10):
  • [37] Fine-Tuning Stomatal Movement Through Small Signaling Peptides
    Qu, Xinyun
    Cao, Bing
    Kang, Jingke
    Wang, Xuening
    Han, Xiangyu
    Jiang, Wenqian
    Shi, Xiong
    Zhang, Luosha
    Cui, Langjun
    Hu, Zhubing
    Zhang, Yonghong
    Wang, Guodong
    FRONTIERS IN PLANT SCIENCE, 2019, 10
  • [38] Fine-Tuning the Energetic Properties of Complexes through Ligand Modification
    Zhang, Ji-Chuan
    Su, Hui
    Guo, Shu
    Dong, Ya-Lu
    Zhang, Shao-Wen
    Zou, Tao
    Li, Sheng-Hua
    Pang, Si-Ping
    CRYSTAL GROWTH & DESIGN, 2018, 18 (04) : 2217 - 2224
  • [39] Fine-Tuning the Fuzziness of Strong Fuzzy Partitions through PSO
    Castiello, Ciro
    Mencar, Corrado
    International Journal of Computational Intelligence Systems, 2020, 13 : 1415 - 1428
  • [40] Fine-tuning nicotinic receptor function through the lipid bilayer
    Baenziger, JE
    da Costa, CJB
    Goodreid, M
    FEBS JOURNAL, 2005, 272 : 234 - 234