Hydra: Multi-head low-rank adaptation for parameter efficient fine-tuning

Cited by: 3
Authors
Kim, Sanghyeon [1 ]
Yang, Hyunmo [2 ]
Kim, Yunghyun [2 ]
Hong, Youngjoon [3 ]
Park, Eunbyung [1 ,2 ]
Affiliations
[1] Sungkyunkwan Univ, Dept Elect & Comp Engn, 2066 Seobu Ro, Suwon 16419, South Korea
[2] Sungkyunkwan Univ, Dept Artificial Intelligence, 2066 Seobu Ro, Suwon 16419, South Korea
[3] Korea Adv Inst Sci & Technol, Dept Math Sci, 291 Daehak Ro, Taejon 305701, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Parameter efficient fine-tuning; Adapter; Transformer; BENCHMARK;
DOI
10.1016/j.neunet.2024.106414
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
The recent surge in large-scale foundation models has spurred the development of efficient methods for adapting these models to various downstream tasks. Low-rank adaptation methods, such as LoRA, have gained significant attention due to their outstanding parameter efficiency and lack of additional inference latency. This paper investigates a more general form of adapter module, based on the analysis that parallel and sequential adaptation branches learn novel and general features, respectively, during fine-tuning. The proposed method, named Hydra, combines parallel and sequential branches to integrate their capabilities; it is more expressive than existing single-branch methods and enables the exploration of a broader range of optimal points during fine-tuning. In addition, the proposed method explicitly leverages the pre-trained weights by performing a linear combination of the pre-trained features, which allows the learned features to generalize better across diverse downstream tasks. Furthermore, we perform a comprehensive analysis of the characteristics of each adaptation branch with empirical evidence. Through an extensive range of experiments, we substantiate the efficiency and demonstrate the superior performance of Hydra. This comprehensive evaluation underscores the potential impact and effectiveness of Hydra in a variety of applications. The source code of this work is publicly available at https://github.com/extremebird/Hydra.
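The sketch below illustrates, in a minimal PyTorch-like form, how a layer combining a parallel (LoRA-style) low-rank branch and a sequential low-rank branch around a frozen pre-trained linear layer could look, following only the high-level description in the abstract. The class name HydraLinear, the rank, and the initialization are illustrative assumptions, not the authors' released implementation (see the GitHub repository above for that).

```python
# Hedged sketch of a multi-branch low-rank adapted linear layer.
# Assumptions: PyTorch, a frozen pre-trained nn.Linear, rank-4 branches,
# zero-initialized "B" matrices so the layer starts identical to the pre-trained one.
import torch
import torch.nn as nn


class HydraLinear(nn.Module):
    def __init__(self, pretrained: nn.Linear, rank: int = 4):
        super().__init__()
        self.pretrained = pretrained
        for p in self.pretrained.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen

        d_in, d_out = pretrained.in_features, pretrained.out_features
        # Parallel branch: low-rank update applied to the input (LoRA-like).
        self.A_par = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B_par = nn.Parameter(torch.zeros(d_out, rank))
        # Sequential branch: low-rank map applied to the pre-trained output,
        # i.e. a linear combination of the pre-trained features.
        self.A_seq = nn.Parameter(torch.randn(rank, d_out) * 0.01)
        self.B_seq = nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.pretrained(x)                        # frozen pre-trained features W0 x
        parallel = x @ self.A_par.T @ self.B_par.T    # B_par A_par x
        sequential = h @ self.A_seq.T @ self.B_seq.T  # B_seq A_seq (W0 x)
        return h + parallel + sequential
```

Because both branches are linear in this sketch, their updates could in principle be merged into the pre-trained weight after fine-tuning, which is how LoRA-style methods avoid additional inference latency.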
Pages: 11
Related papers
50 records in total
  • [1] "Hydra: Multi-head Low-rank Adaptation for Parameter Efficient Fine-tuning" [Neural Networks, Volume 178, October 2024, 1-11/106414]
    Kim, Sanghyeon
    Yang, Hyunmo
    Kim, Younghyun
    Hong, Youngjoon
    Park, Eunbyung
    NEURAL NETWORKS, 2025, 181
  • [2] Structure-Aware Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
    Hu, Yahao
    Xie, Yifei
    Wang, Tianfeng
    Chen, Man
    Pan, Zhisong
    MATHEMATICS, 2023, 11 (20)
  • [3] LoRAPrune: Structured Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning
    Zhang, Mingyang
    Chen, Hao
    Shen, Chunhua
    Yang, Zhen
    Ou, Linlin
    Yu, Xinyi
    Zhuang, Bohan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 3013 - 3026
  • [4] Dropout Mixture Low-Rank Adaptation for Visual Parameters-Efficient Fine-Tuning
    Fang, Zhengyi
    Wang, Yue
    Yi, Ran
    Ma, Lizhuang
    COMPUTER VISION-ECCV 2024, PT VII, 2025, 15065 : 369 - 386
  • [5] Leveraging Low-Rank Adaptation for Parameter-Efficient Fine-Tuning in Multi-Speaker Adaptive Text-to-Speech Synthesis
    Hong, Changi
    Lee, Jung Hyuk
    Kim, Hong Kook
    IEEE ACCESS, 2024, 12 : 190711 - 190727
  • [6] AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models
    Lin, Zeyu
    Kundu, Souvik
    Li, Anni
    Wan, Junrui
    Jiang, Lianghao
    Beerel, Peter A.
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 161 - 167
  • [7] Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks
    Qu, Tingyu
    Tuytelaars, Tinne
    Moens, Marie-Francine
    COMPUTER VISION - ECCV 2024, PT LXXXVIII, 2025, 15146 : 291 - 308
  • [8] Federated Low-Rank Adaptation for Large Models Fine-Tuning Over Wireless Networks
    Sun, Haofeng
    Tian, Hui
    Ni, Wanli
    Zheng, Jingheng
    Niyato, Dusit
    Zhang, Ping
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2025, 24 (01) : 659 - 675
  • [9] Low-Rank Bottleneck in Multi-head Attention Models
    Bhojanapalli, Srinadh
    Yun, Chulhee
    Rawat, Ankit Singh
    Reddi, Sashank
    Kumar, Sanjiv
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [10] Efficient Fine-Tuning of Large Language Models via a Low-Rank Gradient Estimator
    Zhang, Luoming
    Lou, Zhenyu
    Ying, Yangwei
    Yang, Cheng
    Zhou, Hong
    APPLIED SCIENCES-BASEL, 2025, 15 (01):