ATFTrans: attention-weighted token fusion transformer for robust and efficient object tracking

Cited by: 2
Authors
Xu, Liang [1]
Wang, Liejun [1]
Guo, Zhiqing [1]
Affiliation
[1] Xinjiang University, School of Computer Science and Technology, Urumqi 830000, People's Republic of China
Source
NEURAL COMPUTING & APPLICATIONS | 2024 / Vol. 36 / Issue 13
Funding
National Natural Science Foundation of China;
Keywords
Fully transformer-based tracker; Token fusion; Information loss; Efficient inference;
DOI
10.1007/s00521-024-09444-0
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Fully transformer-based trackers have recently achieved impressive tracking results, but at a high computational cost. Some researchers have applied token pruning to these trackers to reduce that cost, but pruning discards contextual information that is important for the tracker's regression task. To address this issue, this paper proposes a token fusion method that speeds up inference while avoiding information loss, thereby improving the robustness of the tracker. Specifically, the input to the transformer encoder consists of search tokens and exemplar tokens, and the search tokens are divided into tracking object tokens and background tokens according to their similarity to the exemplar tokens: search tokens with greater similarity are identified as tracking object tokens, and those with smaller similarity as background tokens. Because the tracking object tokens contain the discriminative features of the tracked object, all of them are kept so that the tracker pays more attention to them. To reduce computation without losing contextual information, the background tokens are weighted by their attention weights and fused into new background tokens. The proposed token fusion method therefore gives the tracker both efficient inference and greater robustness. Extensive experiments on popular tracking benchmark datasets verify the validity of the token fusion method.
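The split-and-fuse step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how similarity or attention weights are computed, or how many fused tokens are produced. The function name `fuse_tokens`, the `keep_ratio` parameter, the mean scaled dot-product similarity, the softmax fusion weights, and the choice to merge all background tokens into a single token are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_tokens(search_tokens, exemplar_tokens, keep_ratio=0.5):
    """Keep object-like search tokens and fuse the rest (hypothetical helper).

    Assumptions, not taken from the paper: similarity is the mean scaled
    dot product with the exemplar tokens, keep_ratio fixes the split, and
    the background tokens are merged into one weighted-average token.
    """
    dim = len(search_tokens[0])
    scale = 1.0 / math.sqrt(dim)

    # Score each search token by its average similarity to the exemplar tokens.
    scores = [
        sum(dot(s, e) for e in exemplar_tokens) * scale / len(exemplar_tokens)
        for s in search_tokens
    ]

    # Highest-scoring tokens are treated as tracking object tokens.
    n_keep = max(1, int(keep_ratio * len(search_tokens)))
    order = sorted(range(len(search_tokens)), key=lambda i: scores[i], reverse=True)
    object_idx = sorted(order[:n_keep])      # keep the original token order
    background_idx = sorted(order[n_keep:])

    kept = [search_tokens[i] for i in object_idx]
    if not background_idx:
        return kept, object_idx

    # Turn background scores into fusion weights that sum to 1, then
    # take the weighted sum so background context is merged, not dropped.
    weights = softmax([scores[i] for i in background_idx])
    fused = [
        sum(w * search_tokens[i][d] for w, i in zip(weights, background_idx))
        for d in range(dim)
    ]
    return kept + [fused], object_idx


exemplar = [[1.0, 0.0]]
search = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.0], [0.1, 0.9]]
tokens, object_idx = fuse_tokens(search, exemplar, keep_ratio=0.5)
print(len(search), "->", len(tokens), "tokens; kept indices:", object_idx)
```

In this toy input, four search tokens shrink to three: the two tokens most similar to the exemplar pass through unchanged, and the two background tokens are replaced by one weighted average. Pruning would instead drop the background tokens entirely, which is the information loss the abstract argues against.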
Pages: 7043-7056
Number of pages: 14