Efficient transformer tracking with adaptive attention

Cited by: 0
Authors
Xiao, Dingkun [1]
Wei, Zhenzhong [1]
Zhang, Guangjun [1]
Affiliations
[1] Beihang Univ, Sch Instrumentat & Optoelect Engn, Key Lab Precis Optomechatron Technol, Minist Educ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
computer vision; convolution; convolutional neural nets; object tracking; target tracking; tracking
DOI
10.1049/cvi2.12315
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, several trackers utilising the Transformer architecture have shown significant performance improvements. However, the high computational cost of multi-head attention, a core component of the Transformer, limits running speed, and real-time speed is crucial for tracking tasks. Additionally, the global mechanism of multi-head attention makes it susceptible to distractors that carry semantic information similar to the target. To address these issues, the authors propose a novel adaptive attention that enhances features through a spatially sparse attention mechanism with less than 1/4 of the computational complexity of multi-head attention. The adaptive attention sets a perception range around each element in the feature map, based on the target scale in the previous tracking result, and adaptively searches for the information of interest. This allows the module to focus on the target region rather than on background distractors. Based on adaptive attention, the authors build an efficient transformer tracking framework. It performs deep interaction between search and template features to activate target information, and it aggregates multi-level interaction features to enhance representation ability. Evaluation results on seven benchmarks show that the authors' tracker achieves outstanding performance at a speed of 43 fps, with significant advantages in hard circumstances.
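The idea of restricting each feature-map element to a perception range can be illustrated with a small sketch. This is not the authors' adaptive attention: it swaps in a plain fixed-window, single-head self-attention as a stand-in, and it does not model the adaptive search step at all. The function names, the `stride` and `ratio` parameters, and the rule that maps the previous target size to a window radius are all hypothetical and chosen only for illustration.

```python
import math


def perception_radius(target_w, target_h, stride=16, ratio=0.5):
    """Hypothetical rule: window radius (in feature cells) derived from the
    previous target size in pixels. stride and ratio are assumptions."""
    cells = max(target_w, target_h) / stride
    return max(1, math.ceil(ratio * cells))


def window_indices(row, col, height, width, radius):
    """Flat indices of the cells inside the square window centred on
    (row, col), clipped to the feature-map borders."""
    return [r * width + c
            for r in range(max(0, row - radius), min(height, row + radius + 1))
            for c in range(max(0, col - radius), min(width, col + radius + 1))]


def windowed_attention(features, height, width, radius):
    """Single-head self-attention in which each cell attends only to its window.

    features: list of height*width vectors (lists of floats), used directly
    as queries, keys and values (no learned projections in this sketch).
    Returns the attended features and the number of query-key pairs scored.
    """
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    output, pairs = [], 0
    for row in range(height):
        for col in range(width):
            query = features[row * width + col]
            keys = window_indices(row, col, height, width, radius)
            scores = [scale * sum(q * k for q, k in zip(query, features[i]))
                      for i in keys]
            top = max(scores)  # subtract the maximum for a stable softmax
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            output.append([sum(w * features[i][d] for w, i in zip(weights, keys)) / total
                           for d in range(dim)])
            pairs += len(keys)
    return output, pairs


# Toy usage: a 16x16 map with 4-dimensional deterministic features.
height, width, dim = 16, 16, 4
features = [[math.sin(i + d) for d in range(dim)] for i in range(height * width)]
radius = perception_radius(target_w=64, target_h=48)
attended, sparse_pairs = windowed_attention(features, height, width, radius)
dense_pairs = (height * width) ** 2
print("fraction of dense query-key pairs scored:", sparse_pairs / dense_pairs)
```

The printed fraction describes only this toy window and map size; it is not a reproduction of the complexity figure reported in the abstract.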
Pages: 13