Efficient transformer tracking with adaptive attention

Cited: 0
Authors
Xiao, Dingkun [1 ]
Wei, Zhenzhong [1 ]
Zhang, Guangjun [1 ]
Affiliation
[1] Beihang Univ, Sch Instrumentat & Optoelect Engn, Key Lab Precis Optomechatron Technol, Minist Educ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
computer vision; convolution; convolutional neural nets; object tracking; target tracking; tracking
DOI
10.1049/cvi2.12315
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Several recently proposed trackers built on the Transformer architecture have shown significant performance improvements. However, the high computational cost of multi-head attention, a core component of the Transformer, limits real-time running speed, which is crucial for tracking tasks. Additionally, the global receptive field of multi-head attention makes it susceptible to distractors whose semantic information is similar to the target's. To address these issues, the authors propose a novel adaptive attention that enhances features through a spatially sparse attention mechanism at less than 1/4 of the computational complexity of multi-head attention. The adaptive attention sets a perception range around each element of the feature map based on the target scale in the previous tracking result and adaptively searches for the information of interest, allowing the module to focus on the target region rather than on background distractors. Based on adaptive attention, the authors build an efficient Transformer tracking framework that performs deep interaction between search and template features to activate target information and aggregates multi-level interaction features to strengthen the representation. Evaluation results on seven benchmarks show that the tracker achieves outstanding performance at 43 fps, with significant advantages in challenging scenarios.
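The record does not include the paper's exact formulation, but the abstract's core idea (each feature-map element attends only within a local perception window whose size follows the previous target scale, rather than attending globally) can be illustrated with a minimal NumPy sketch. All names and the window parameterisation here are hypothetical, not the authors' implementation:

```python
import numpy as np

def adaptive_window_attention(feat, radius):
    """Spatially sparse self-attention sketch: each position (i, j) attends
    only to a (2*radius + 1)^2 local window instead of all H*W positions.
    `radius` stands in for a perception range derived from the previous
    target scale (hypothetical parameterisation).
    feat: (H, W, C) feature map.
    """
    H, W, C = feat.shape
    out = np.zeros_like(feat)
    scale = 1.0 / np.sqrt(C)
    for i in range(H):
        for j in range(W):
            # Clip the local window to the feature-map borders.
            y0, y1 = max(0, i - radius), min(H, i + radius + 1)
            x0, x1 = max(0, j - radius), min(W, j + radius + 1)
            keys = feat[y0:y1, x0:x1].reshape(-1, C)   # window elements
            logits = keys @ feat[i, j] * scale         # query = centre element
            weights = np.exp(logits - logits.max())    # stable softmax
            weights /= weights.sum()
            out[i, j] = weights @ keys                 # weighted sum of values
    return out
```

With a window of w = 2*radius + 1 elements per side, the cost is O(H*W * w^2 * C) versus O((H*W)^2 * C) for full attention, which is where a below-1/4 complexity figure can come from once w^2 is well under H*W.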
Pages: 13
Related Papers (50 total)
  • [1] Adaptive sparse attention-based compact transformer for object tracking
    Pan, Fei
    Zhao, Lianyu
    Wang, Chenglin
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [2] Cross-Parallel Attention and Efficient Match Transformer for Aerial Tracking
    Deng, Anping
    Han, Guangliang
    Zhang, Zhongbo
    Chen, Dianbing
    Ma, Tianjiao
    Liu, Zhichao
    REMOTE SENSING, 2024, 16 (06)
  • [3] AiATrack: Attention in Attention for Transformer Visual Tracking
    Gao, Shenyuan
    Zhou, Chunluan
    Ma, Chao
    Wang, Xinggang
    Yuan, Junsong
    COMPUTER VISION, ECCV 2022, PT XXII, 2022, 13682 : 146 - 164
  • [4] Efficient Visual Tracking Using Local Information Patch Attention Free Transformer
    Wang, Pin-Feng
    Tang, Chih-Wei
    2022 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - TAIWAN, IEEE ICCE-TW 2022, 2022: 447 - 448
  • [5] ATFTrans: attention-weighted token fusion transformer for robust and efficient object tracking
    Xu, Liang
    Wang, Liejun
    Guo, Zhiqing
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (13): 7043 - 7056
  • [6] An efficient object tracking based on multi-head cross-attention transformer
    Dai, Jiahai
    Li, Huimin
    Jiang, Shan
    Yang, Hongwei
    EXPERT SYSTEMS, 2025, 42 (02)
  • [7] MTAtrack: Multilevel transformer attention for visual tracking
    An, Dong
    Zhang, Fan
    Zhao, Yuqian
    Luo, Biao
    Yang, Chunhua
    Chen, Baifan
    Yu, Lingli
    OPTICS AND LASER TECHNOLOGY, 2023, 166
  • [8] Transformer Tracking with Cyclic Shifting Window Attention
    Song, Zikai
    Yu, Junqing
    Chen, Yi-Ping Phoebe
    Yang, Wei
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 8781 - 8790
  • [9] SCATT: Transformer tracking with symmetric cross-attention
    Zhang, Jianming
    Chen, Wentao
    Dai, Jiangxin
    Zhang, Jin
    APPLIED INTELLIGENCE, 2024, 54 (08): 6069 - 6084