ATFTrans: attention-weighted token fusion transformer for robust and efficient object tracking

Cited by: 2
Authors
Xu, Liang [1]
Wang, Liejun [1]
Guo, Zhiqing [1]
Affiliations
[1] Xinjiang Univ, Sch Comp Sci & Technol, Urumqi 830000, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2024, Vol. 36, Issue 13
Funding
National Natural Science Foundation of China;
Keywords
Fully transformer-based tracker; Token fusion; Information loss; Efficient inference;
DOI
10.1007/s00521-024-09444-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fully transformer-based trackers have recently achieved impressive tracking results, but at the cost of high computational complexity. Some researchers have applied token pruning to these trackers to reduce that complexity, but pruning discards contextual information that is important for the tracker's regression task. To address this issue, this paper proposes a token fusion method that speeds up inference while avoiding information loss, thereby improving the robustness of the tracker. Specifically, the input to the transformer encoder consists of search tokens and exemplar tokens, and the search tokens are divided into tracking object tokens and background tokens according to their similarity to the exemplar tokens: search tokens with higher similarity are identified as tracking object tokens, and those with lower similarity as background tokens. The tracking object tokens carry the discriminative features of the tracked object, so all of them are kept, which lets the tracker focus on the object while reducing computation. The background tokens are then weighted by their attention weights and fused into new background tokens, which prevents the loss of contextual information. The proposed token fusion method thus makes the tracker both more efficient at inference and more robust. Extensive experiments on popular tracking benchmark datasets verify the effectiveness of the token fusion method.
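The split-and-fuse step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that similarity is the mean scaled dot product between a search token and the exemplar tokens, that a fixed `keep_ratio` decides how many search tokens count as tracking object tokens, and that all background tokens are fused into a single token using softmax weights over their similarity scores. The function names `fuse_tokens`, `dot`, and `softmax`, and the `keep_ratio` parameter, are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_tokens(search_tokens, exemplar_tokens, keep_ratio=0.5):
    """Keep object-like search tokens and fuse the rest into one token.

    Assumptions (not taken from the paper): mean scaled dot-product
    similarity, a fixed keep ratio, and a single fused background token.
    Returns (output_tokens, kept_indices).
    """
    dim = len(search_tokens[0])
    scale = math.sqrt(dim)

    # Similarity of each search token to the exemplar tokens.
    scores = [
        sum(dot(s, e) for e in exemplar_tokens) / (len(exemplar_tokens) * scale)
        for s in search_tokens
    ]

    # Highest-scoring tokens are treated as tracking object tokens.
    n_keep = max(1, math.ceil(keep_ratio * len(search_tokens)))
    order = sorted(range(len(search_tokens)), key=lambda i: scores[i], reverse=True)
    object_idx = sorted(order[:n_keep])
    background_idx = sorted(order[n_keep:])

    kept = [search_tokens[i] for i in object_idx]
    if not background_idx:
        return kept, object_idx

    # Weighted sum of the background tokens instead of discarding them.
    weights = softmax([scores[i] for i in background_idx])
    fused = [
        sum(w * search_tokens[i][d] for w, i in zip(weights, background_idx))
        for d in range(dim)
    ]
    return kept + [fused], object_idx


if __name__ == "__main__":
    exemplar = [[1.0, 0.0]]
    search = [[2.0, 0.0], [0.0, 1.0], [0.0, 3.0], [1.0, 0.0]]
    tokens, kept_idx = fuse_tokens(search, exemplar, keep_ratio=0.5)
    print(kept_idx, tokens)
```

In contrast to pruning, the output sequence still carries a summary of the background region in the fused token, while the encoder processes fewer tokens than the original search sequence.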
Pages: 7043-7056
Number of pages: 14