Efficient Low-rank Backpropagation for Vision Transformer Adaptation

Cited by: 0
Authors
Yang, Yuedong [1]
Chiang, Hung-Yueh [1]
Li, Guihong [1]
Marculescu, Diana [1]
Marculescu, Radu [1]
Affiliations
[1] The University of Texas at Austin, Chandra Family Department of Electrical and Computer Engineering, Austin, TX 78712, USA
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The increasing scale of vision transformers (ViT) has made the efficient fine-tuning of these large models for specific needs a significant challenge in various applications. This issue originates from the computationally demanding matrix multiplications required during backpropagation through the linear layers in ViT. In this paper, we tackle this problem by proposing a new Low-rank Back-Propagation via Walsh-Hadamard Transformation (LBP-WHT) method. Intuitively, LBP-WHT projects the gradient into a low-rank space and carries out backpropagation there. This approach substantially reduces the computation needed for adapting ViT, as matrix multiplication in the low-rank space is far less resource-intensive. We conduct extensive experiments with different models (ViT and hybrid convolution-ViT models) on multiple datasets to demonstrate the effectiveness of our method. For instance, when adapting an EfficientFormer-L1 model on CIFAR100, our LBP-WHT achieves 10.4% higher accuracy than the state-of-the-art baseline while requiring 9 MFLOPs less computation. As the first work to accelerate ViT adaptation with low-rank backpropagation, our LBP-WHT method is complementary to many prior efforts and can be combined with them for better performance. Code: https://github.com/SLDGroup/LBP-WHT
Pages: 12
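
The following is a minimal, hypothetical PyTorch sketch of the idea the abstract describes, written for intuition only: during the backward pass of a linear layer, the output gradient and the saved input are projected onto the first k sequency-ordered Walsh-Hadamard basis vectors, so the expensive weight-gradient multiplication happens in a rank-k space. The names (walsh_rows, LowRankLinearFn) and simplifications (e.g., keeping the input gradient exact) are assumptions of this sketch, not the authors' implementation; the real LBP-WHT code is in the linked repository.

import torch

def walsh_rows(n: int, k: int) -> torch.Tensor:
    # First k rows of an n x n Hadamard matrix (n must be a power of two),
    # reordered by sequency (number of sign changes) so the leading rows
    # capture the lowest-"frequency" components of the gradient.
    H = torch.tensor([[1.0]])
    while H.shape[0] < n:
        H = torch.cat([torch.cat([H, H], dim=1),
                       torch.cat([H, -H], dim=1)], dim=0)
    sign_changes = (H[:, 1:] != H[:, :-1]).sum(dim=1)
    return H[torch.argsort(sign_changes)][:k]          # shape (k, n)

class LowRankLinearFn(torch.autograd.Function):
    # Hypothetical linear layer: exact forward pass, low-rank backward
    # pass for the weight gradient.
    @staticmethod
    def forward(ctx, x, weight, B):
        # x: (n_tokens, in_features), weight: (out_features, in_features)
        ctx.save_for_backward(x, weight, B)
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        x, weight, B = ctx.saved_tensors
        n = x.shape[0]
        # Project along the token dimension onto the k Hadamard rows
        # (each row has squared norm n, hence the 1/n factor):
        # grad_w ~= grad_out^T (B^T B / n) x = (B grad_out / n)^T (B x)
        g_low = (B @ grad_out) / n                     # (k, out_features)
        x_low = B @ x                                  # (k, in_features)
        grad_w = g_low.t() @ x_low                     # rank-k weight gradient
        grad_x = grad_out @ weight                     # kept exact in this sketch
        return grad_x, grad_w, None

# Toy usage: rank-8 backpropagation over 64 tokens.
x = torch.randn(64, 32, requires_grad=True)
w = torch.randn(16, 32, requires_grad=True)
B = walsh_rows(64, 8)
LowRankLinearFn.apply(x, w, B).sum().backward()

With n = 64 tokens and rank k = 8, the contraction that forms the weight gradient shrinks from length 64 to length 8, which illustrates where the FLOP savings reported in the abstract come from; the projection itself is cheap because a fast Walsh-Hadamard transform uses only additions and subtractions.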