Propagating prior information with transformer for robust visual object tracking

Cited by: 0
Authors
Wu, Yue [1 ]
Cai, Chengtao [1 ,2 ]
Yeo, Chai Kiat [3 ]
Affiliations
[1] Harbin Engn Univ, Sch Intelligent Sci & Engn, Harbin 150001, Peoples R China
[2] Harbin Engn Univ, Key Lab Intelligent Technol & Applicat Marine Equi, Minist Educ, Harbin 150001, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
Keywords
Visual object tracking; Siamese network; Transformer; Prior information; VIDEO;
DOI
10.1007/s00530-024-01423-8
CLC number
TP [Automation technology, computer technology];
Discipline code
0812
Abstract
In recent years, visual object tracking has advanced considerably with the advent of deep learning. Siamese-based trackers have been pivotal, establishing a new architecture built on a weight-shared backbone. With the introduction of the transformer, the attention mechanism has been exploited to enhance feature discriminability across successive frames. However, the limited adaptability of many existing trackers to different tracking scenarios leads to inaccurate target localization. To address this issue, we integrate a Siamese network with a transformer: the former uses ResNet50 as the backbone to extract target features, while the latter consists of an encoder and a decoder. The encoder exploits global contextual information to obtain discriminative features, and the decoder propagates prior information about the target, which enables the tracker to locate the target reliably in a variety of environments and improves its stability and robustness. Extensive experiments on four major public datasets, OTB100, UAV123, GOT10k and LaSOText, demonstrate the effectiveness of the proposed method, whose performance surpasses many state-of-the-art trackers. In addition, the proposed tracker runs at 60 fps, meeting the requirements of real-time tracking.
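The architecture summarized in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation; it is a hedged, PyTorch-style illustration assuming a shared ResNet50 backbone for template and search crops, a transformer encoder that fuses both feature sets for global context, and a decoder query that carries prior target information propagated from earlier frames. All module names, dimensions, and the box-head output format are assumptions made for illustration only.

```python
# Minimal sketch (not the authors' code) of a Siamese + transformer tracker
# that propagates a prior-information query across frames. Positional
# encodings and training losses are omitted for brevity.
import torch
import torch.nn as nn
from torchvision.models import resnet50


class PriorPropagationTracker(nn.Module):
    def __init__(self, dim=256, heads=8, enc_layers=2, dec_layers=2):
        super().__init__()
        backbone = resnet50(weights=None)
        # Weight-shared (Siamese) backbone: keep layers up to conv4 (stride 16).
        self.backbone = nn.Sequential(*list(backbone.children())[:-3])
        self.proj = nn.Conv2d(1024, dim, kernel_size=1)  # channel reduction
        self.transformer = nn.Transformer(
            d_model=dim, nhead=heads,
            num_encoder_layers=enc_layers, num_decoder_layers=dec_layers,
            batch_first=True,
        )
        # Learnable query standing in for prior target information (assumption).
        self.prior_query = nn.Parameter(torch.randn(1, 1, dim))
        self.box_head = nn.Linear(dim, 4)  # (cx, cy, w, h), normalized

    def extract(self, image):
        feat = self.proj(self.backbone(image))   # B x dim x H x W
        return feat.flatten(2).transpose(1, 2)   # B x HW x dim tokens

    def forward(self, template, search, prior=None):
        # Encoder sees template and search tokens jointly -> global context.
        memory_in = torch.cat([self.extract(template), self.extract(search)], dim=1)
        b = memory_in.size(0)
        # Decoder query carries prior information propagated from earlier frames.
        query = prior if prior is not None else self.prior_query.expand(b, -1, -1)
        out = self.transformer(memory_in, query)  # B x 1 x dim
        return self.box_head(out).sigmoid(), out  # predicted box, updated prior


if __name__ == "__main__":
    tracker = PriorPropagationTracker()
    z = torch.randn(1, 3, 128, 128)    # template crop
    x = torch.randn(1, 3, 256, 256)    # search region
    box, prior = tracker(z, x)         # first frame: learned query as prior
    box, prior = tracker(z, x, prior)  # later frames: propagate the prior
    print(box.shape)                   # torch.Size([1, 1, 4])
```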
Pages: 14
Related papers
50 records
  • [21] A Robust Visual Object Tracking Approach on a Mobile Device
    Mohammed, Abdulmalik Danlami
    Morris, Tim
    INFORMATION AND COMMUNICATION TECHNOLOGY, 2014, 8407 : 190 - 198
  • [22] MATI: Multimodal Adaptive Tracking Integrator for Robust Visual Object Tracking
    Li, Kai
    Cai, Lihua
    He, Guangjian
    Gong, Xun
    SENSORS, 2024, 24 (15)
  • [23] Visual Tracking based on deformable Transformer and spatiotemporal information
    Wu, Ruixu
    Wen, Xianbin
    Yuan, Liming
    Xu, Haixia
    Liu, Yanli
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 127
  • [24] A robust attention-enhanced network with transformer for visual tracking
    Gu, Fengwei
    Lu, Jun
    Cai, Chengtao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (26) : 40761 - 40782
  • [25] RPformer: A Robust Parallel Transformer for Visual Tracking in Complex Scenes
    Gu, Fengwei
    Lu, Jun
    Cai, Chengtao
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71
  • [27] Robust Visual Tracking based on Deep Spatial Transformer Features
    Zhang, Ximing
    Wang, Mingang
    Wei, Jinkang
    Cui, Can
    PROCEEDINGS OF THE 30TH CHINESE CONTROL AND DECISION CONFERENCE (2018 CCDC), 2018, : 5036 - 5041
  • [28] RTSformer: A Robust Toroidal Transformer With Spatiotemporal Features for Visual Tracking
    Gu, Fengwei
    Lu, Jun
    Cai, Chengtao
    Zhu, Qidan
    Ju, Zhaojie
    IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, 2024, 54 (02) : 214 - 225
  • [29] Integration of Texture and Depth Information for Robust Object Tracking
    Lin, Yu-Hang
    Chen, Ju-Chin
    Lin, Kawuu W.
    2014 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC), 2014, : 170 - 174
  • [30] Robust Object Tracking via Information Theoretic Measures
    Wang, Wei-Ning
    Li, Qi
    Wang, Liang
    INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2020, 17 (05) : 652 - 666