Multiple templates transformer for visual object tracking

Cited: 7
Authors
Pang, Haibo [1 ]
Su, Jie [1 ]
Ma, Rongqi [1 ]
Li, Tingting [1 ]
Liu, Chengming [1 ]
Affiliations
[1] Zhengzhou Univ, Sch Cyber Sci & Engn, Zhengzhou 450002, Peoples R China
Keywords
Single object tracking; Siamese tracker; Multiple templates; Transformer;
DOI
10.1016/j.knosys.2023.111025
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Matching the similarity between a template and the search region is crucial in Siamese trackers. However, because a fixed template provides only limited information, existing trackers are not robust enough in complex scenarios such as severe deformation, background clutter, out-of-view targets, illumination variation, low resolution, scale variation, fast motion, and full occlusion. It is therefore essential to use a more informative template. Additionally, since the Transformer offers superior modeling capability compared with traditional cross-correlation in tracking, some Siamese trackers have integrated Transformers and achieved exceptional performance. In this paper, we present a novel tracking architecture, the Multiple Templates Transformer (MTT), to address these issues. By utilizing multiple templates, the proposed method captures more contextual information and historical changes of the target, which are leveraged to enhance the response in the search region through an encoder-decoder framework. We also explore different mechanisms for fusing templates effectively to achieve higher accuracy. We evaluate MTT on several widely used benchmarks, including GOT-10k, TrackingNet, UAV123, OTB2015, VOT2018, and LaSOT. Extensive experimental results indicate that our tracker achieves better robustness under these challenges while maintaining real-time speed.
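
The abstract only outlines the encoder-decoder fusion at a high level. The following is a minimal, hypothetical PyTorch sketch (not the authors' code) of the general idea: several template feature maps are flattened to tokens, concatenated and self-attended in an encoder, and then used as memory for a decoder that lets the search-region tokens cross-attend to the fused templates. All names (MultiTemplateFusion, template_feats, search_feat, the 256-dim setting) and the simple concatenation-based fusion are illustrative assumptions, not details from the paper.

import torch
import torch.nn as nn

class MultiTemplateFusion(nn.Module):
    """Fuses several template token sequences and lets the search-region
    tokens attend to them via a standard transformer encoder-decoder."""
    def __init__(self, dim=256, heads=8, layers=2):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=layers)

    def forward(self, template_feats, search_feat):
        # template_feats: list of (B, N_t, C) token sequences, one per template
        # search_feat:    (B, N_s, C) search-region tokens
        memory = self.encoder(torch.cat(template_feats, dim=1))  # self-attention across all template tokens
        return self.decoder(search_feat, memory)                 # search tokens cross-attend to fused templates

# Toy usage: three 8x8 template feature maps and one 16x16 search map, flattened to tokens.
templates = [torch.randn(1, 64, 256) for _ in range(3)]
search = torch.randn(1, 256, 256)
enhanced = MultiTemplateFusion()(templates, search)
print(enhanced.shape)  # torch.Size([1, 256, 256])

The paper states that several fusion mechanisms are explored; plain concatenation before the encoder is only the simplest possible stand-in here.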
Pages: 13
Related Papers
50 records in total
  • [11] Visual Attention Is Required for Multiple Object Tracking
    Tran, Annie
    Hoffman, James E.
    JOURNAL OF EXPERIMENTAL PSYCHOLOGY-HUMAN PERCEPTION AND PERFORMANCE, 2016, 42 (12) : 2103 - 2114
  • [12] Visual Learning in Multiple-Object Tracking
    Makovski, Tal
    Vazquez, Gustavo A.
    Jiang, Yuhong V.
    PLOS ONE, 2008, 3 (05):
  • [13] Learning Feature Restoration Transformer for Robust Dehazing Visual Object Tracking
    Xu, Tianyang
    Pan, Yifan
    Feng, Zhenhua
    Zhu, Xuefeng
    Cheng, Chunyang
    Wu, Xiao-Jun
    Kittler, Josef
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (12) : 6021 - 6038
  • [14] Foreground-Background Distribution Modeling Transformer for Visual Object Tracking
    Yang, Dawei
    He, Jianfeng
    Ma, Yinchao
    Yu, Qianjin
    Zhang, Tianzhu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 10083 - 10093
  • [15] Sparse Transformer-Based Sequence Generation for Visual Object Tracking
    Tian, Dan
    Liu, Dong-Xin
    Wang, Xiao
    Hao, Ying
    IEEE ACCESS, 2024, 12 : 154418 - 154425
  • [16] FETrack: Feature-Enhanced Transformer Network for Visual Object Tracking
    Liu, Hang
    Huang, Detian
    Lin, Mingxin
    APPLIED SCIENCES-BASEL, 2024, 14 (22):
  • [17] Memory Prompt for Spatio-Temporal Transformer Visual Object Tracking
    Xu, T.
    Wu, X.
    Zhu, X.
    Kittler, J.
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2024, 5 (08) : 1 - 6
  • [18] Transformer-Based Visual Object Tracking with Global Feature Enhancement
    Wang, Shuai
    Fang, Genwen
    Liu, Lei
    Wang, Jun
    Zhu, Kongfen
    Melo, Silas N.
    APPLIED SCIENCES-BASEL, 2023, 13 (23):
  • [19] The effect of visual distinctiveness on multiple object tracking performance
    Howe, Piers D. L.
    Holcombe, Alex O.
    FRONTIERS IN PSYCHOLOGY, 2012, 3
  • [20] Online learning of multiple detectors for visual object tracking
    Quan, Wei
    Chen, Jin-Xiong
    Yu, Nan-Yang
    TIEN TZU HSUEH PAO/ACTA ELECTRONICA SINICA, 2014, 42 (05) : 875 - 882