Multiple templates transformer for visual object tracking

Cited by: 7
Authors
Pang, Haibo [1]
Su, Jie [1]
Ma, Rongqi [1]
Li, Tingting [1]
Liu, Chengming [1]
Affiliations
[1] Zhengzhou Univ, Sch Cyber Sci & Engn, Zhengzhou 450002, Peoples R China
Keywords
Single object tracking; Siamese tracker; Multiple templates; Transformer
DOI
10.1016/j.knosys.2023.111025
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Matching the similarity between a template and a search region is crucial in Siamese trackers. However, because a fixed template provides limited information, existing trackers are not robust enough in complex scenarios such as severe deformation, background clutter, out-of-view targets, illumination variation, low resolution, scale variation, fast motion, and full occlusion. It is therefore essential to use an informative template. Additionally, since the Transformer has superior modeling capability compared to traditional cross-correlation in tracking, some Siamese trackers have integrated Transformers and achieved exceptional performance. In this paper, we present a novel tracking architecture, the Multiple Templates Transformer (MTT), to address the above issues. By utilizing multiple templates, the proposed method captures more contextual information and historical changes of the target, which are leveraged to enhance the response in the search region using an encoder-decoder framework. We also explore different mechanisms for fusing templates effectively to achieve higher accuracy. We evaluate MTT on several well-known benchmarks: GOT-10k, TrackingNet, UAV123, OTB2015, VOT2018, and LaSOT. Extensive experimental results indicate that our tracker achieves better robustness under different challenges while maintaining real-time speed.
Pages: 13