Adder Attention for Vision Transformer

Cited by: 0
Authors
Shu, Han [1 ]
Wang, Jiahao [2 ]
Chen, Hanting [1 ,3 ]
Li, Lin [4 ]
Yang, Yujiu [2 ]
Wang, Yunhe [1 ]
Affiliations
[1] Huawei Noah's Ark Lab, Hong Kong, Peoples R China
[2] Tsinghua Shenzhen Int Grad Sch, Shenzhen, Peoples R China
[3] Peking Univ, Beijing, Peoples R China
[4] Huawei Technol, Shenzhen, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The transformer is a new computational paradigm for deep learning that has shown strong performance on a wide variety of computer vision tasks. However, compared with conventional deep models (e.g., convolutional neural networks), vision transformers require more computational resources and cannot be easily deployed on mobile devices. To this end, we propose to reduce energy consumption using adder neural networks (AdderNet). We first theoretically analyze the mechanism of self-attention and the difficulty of applying adder operations to this module. Specifically, the feature diversity, i.e., the rank of the attention map, cannot be well preserved using only additions. We therefore develop an adder attention layer that includes an additional identity mapping. With this new operation, vision transformers constructed using additions can also provide powerful feature representations. Experimental results on several benchmarks demonstrate that the proposed approach achieves performance highly competitive with that of the baselines while reducing energy consumption by roughly 2~3x.
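To make the idea in the abstract concrete, the sketch below shows one way an addition-based attention layer with an extra identity mapping could look in PyTorch. It is a minimal illustration under stated assumptions, not the authors' released implementation: the class name AdderAttentionSketch, the negative-L1 similarity, the 1/head_dim scaling, and the placement of the identity mapping on the attention map are all assumptions, and the linear projections and value aggregation here still use multiplications, unlike a full AdderNet.

```python
# Illustrative sketch only: addition-only query-key similarity plus an
# identity mapping on the attention map (assumed design, not the paper's code).
import torch
import torch.nn as nn


class AdderAttentionSketch(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        assert dim % num_heads == 0, "dim must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each: (B, heads, N, head_dim)

        # Addition-only similarity: negative L1 distance between queries and
        # keys, replacing the multiplication-heavy dot product.
        attn = -(q.unsqueeze(3) - k.unsqueeze(2)).abs().sum(dim=-1) / self.head_dim
        attn = attn.softmax(dim=-1)  # (B, heads, N, N)

        # Additional identity mapping on the attention map, intended to keep
        # its rank (feature diversity) from collapsing under additions only.
        attn = attn + torch.eye(N, device=x.device, dtype=attn.dtype)

        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)


if __name__ == "__main__":
    layer = AdderAttentionSketch(dim=64, num_heads=4)
    tokens = torch.randn(2, 197, 64)   # ViT-style token sequence
    print(layer(tokens).shape)         # torch.Size([2, 197, 64])
```

The identity term added to the attention map mirrors the abstract's "additional identity mapping": it guarantees the map stays full rank even when the addition-based similarity alone would produce nearly uniform, low-rank attention.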
Pages: 11
Related Papers
50 records in total
  • [1] Vision Transformer With Quadrangle Attention
    Zhang, Qiming
    Zhang, Jing
    Xu, Yufei
    Tao, Dacheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 3608 - 3624
  • [2] Vision Transformer with Deformable Attention
    Xia, Zhuofan
    Pan, Xuran
    Song, Shiji
    Li, Li Erran
    Huang, Gao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4784 - 4793
  • [3] CoAtFormer: Vision Transformer with Composite Attention
    Chang, Zhiyong
    Yin, Mingjun
    Wang, Yan
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 614 - 622
  • [4] FLatten Transformer: Vision Transformer using Focused Linear Attention
    Han, Dongchen
    Pan, Xuran
    Han, Yizeng
    Song, Shiji
    Huang, Gao
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 5938 - 5948
  • [5] Dense Attention: A Densely Connected Attention Mechanism for Vision Transformer
    Li, Nannan
    Chen, Yaran
    Zhao, Dongbin
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [6] Attention Probe: Vision Transformer Distillation in the Wild
    Wang, Jiahao
    Cao, Mingdeng
    Shi, Shuwei
    Wu, Baoyuan
    Yang, Yujiu
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2220 - 2224
  • [7] Fast Vision Transformer via Additive Attention
    Wen, Yang
    Chen, Samuel
    Shrestha, Abhishek Krishna
    2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024, : 573 - 574
  • [8] Couplformer: Rethinking Vision Transformer with Coupling Attention
    Lan, Hai
    Wang, Xihao
    Shen, Hao
    Liang, Peidong
    Wei, Xian
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 6464 - 6473
  • [9] Lite Vision Transformer with Enhanced Self-Attention
    Yang, Chenglin
    Wang, Yilin
    Zhang, Jianming
    Zhang, He
    Wei, Zijun
    Lin, Zhe
    Yuille, Alan
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 11988 - 11998
  • [10] Attention Combined Pyramid Vision Transformer for Polyp Segmentation
    Liu, Xiaogang
    Song, Shuang
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 89