Exploring reliable infrared object tracking with spatio-temporal fusion transformer

Cited by: 2
Authors
Qi, Meibin [1]
Wang, Qinxin [1]
Zhuang, Shuo [1]
Zhang, Ke [2]
Li, Kunyuan [1]
Liu, Yimin [1]
Yang, Yanfang [3]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China
[2] Anhui NARI Jiyuan Power Grid Technol Co Ltd, Hefei, Peoples R China
[3] Hefei Univ Technol, Sch Phys, Hefei, Peoples R China
Funding
Natural Science Foundation of Anhui Province;
Keywords
Thermal infrared object tracking; Spatio-temporal information; Salient points; Target state estimation; NETWORKS;
DOI
10.1016/j.knosys.2023.111234
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Currently, most thermal infrared (TIR) trackers rely on feature matching between the search image and a fixed template cropped from the first frame. Some Siamese-based TIR trackers with a template update mechanism introduce historical prediction information in the temporal dimension through correlation filters. However, their feature representation capability is inadequate to cope with target scale variations, appearance changes, and occlusion. To address this challenge, we explore a novel spatio-temporal fusion Transformer (STFT) model for robust TIR object tracking. Our approach uses a Transformer-based encoder-decoder that fuses spatio-temporal information. Specifically, we design a dynamic template update strategy based on a salient point feature (SPF) representation, which allows the model to leverage the most informative spatio-temporal cues by retrieving multiple salient points on the target image. To further strengthen the dynamic template update strategy, we propose an IoU-Aware target state estimation head that uses a joint representation of target classification and localization. An IoU-Aware criterion is developed to estimate the quality of the dynamic template. The proposed STFT-Net approach is evaluated on three challenging benchmarks, and extensive experimental results show superior performance compared with well-known tracking algorithms. The code is available at https://github.com/qinxin-wh/STFT-Net.
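The abstract does not spell out the IoU-Aware criterion, so the following is only a minimal sketch of the underlying idea, not the authors' implementation. It shows the standard intersection-over-union (IoU) between two axis-aligned boxes, which is the quantity an IoU-aware head would typically be trained to predict, and a hypothetical gating rule that accepts a new dynamic template only when a quality score is high enough. The names `iou` and `should_update_template`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are all assumptions made for illustration.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Standard intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0.0 else 0.0


def should_update_template(quality_score: float, threshold: float = 0.5) -> bool:
    """Hypothetical gate: replace the dynamic template only if the
    (predicted) quality score reaches the threshold. The threshold
    value is an illustrative assumption, not taken from the paper."""
    return quality_score >= threshold


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    reference = (1.0, 1.0, 3.0, 3.0)
    score = iou(predicted, reference)
    print(f"IoU = {score:.4f}, update template: {should_update_template(score)}")
```

At inference time no ground-truth box is available, so in a real tracker the score passed to the gate would come from the network's own quality prediction rather than from `iou` directly; the function is included here only to make the target quantity concrete.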
Pages: 13