Sparse Transformer-Based Sequence Generation for Visual Object Tracking

Cited by: 0
Authors
Tian, Dan [1]
Liu, Dong-Xin [2]
Wang, Xiao [2]
Hao, Ying [2]
Affiliations
[1] Shenyang Univ, Sch Intelligent Syst Sci & Engn, Shenyang 110044, Liaoning, Peoples R China
[2] Shenyang Univ, Sch Informat Engn, Shenyang 110044, Liaoning, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Transformers; Visualization; Target tracking; Decoding; Feature extraction; Attention mechanisms; Object tracking; Training; Interference; Attention mechanism; sequence generation; sparse attention; visual object tracking; vision transformer;
DOI
10.1109/ACCESS.2024.3482468
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In visual object tracking, attention mechanisms can flexibly and efficiently handle complex dependencies and global information, which improves tracking accuracy. However, in scenes that contain a large amount of background or other distracting information, global attention can dilute the weight of important information and allocate unnecessary attention to the background, which reduces tracking performance. To relieve this problem, this paper proposes a visual object tracking framework based on a sparse transformer. Our tracking framework is a simple encoder-decoder structure that predicts the target in an autoregressive manner, eliminating the additional head network and simplifying the tracking architecture. Furthermore, we introduce a Sparse Attention Mechanism (SMA) in the cross-attention layer of the decoder. Unlike traditional attention mechanisms, SMA attends only to the top K pixel values that are most relevant to the current pixel when calculating attention weights. This allows the model to focus more on key information and to better discriminate foreground from background, resulting in more accurate and robust tracking. We conduct tests on six tracking benchmarks, and the experimental results demonstrate the effectiveness of our method.
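The top-K selection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `topk_sparse_attention`, the use of scaled dot-product scores, and the plain-list data layout are all assumptions made for illustration. The sketch only shows the general idea of keeping the K highest-scoring keys for each query and normalizing the attention weights over those keys alone.

```python
import math


def topk_sparse_attention(queries, keys, values, k):
    """Attention that keeps only the k highest-scoring keys per query.

    Hypothetical helper for illustration, not the paper's code.
    queries: list of d-dimensional vectors (lists of floats)
    keys:    list of d-dimensional vectors
    values:  list of vectors, one per key
    Returns (outputs, weights); weights[i][j] is 0.0 for keys outside the top k.
    """
    d = len(keys[0])
    k = min(k, len(keys))
    outputs, all_weights = [], []
    for q in queries:
        # Scaled dot-product similarity between this query and every key.
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in keys]
        # Indices of the k largest scores; every other key is masked out.
        keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
        # Softmax restricted to the kept keys (max subtracted for stability).
        top = max(scores[j] for j in keep)
        exps = {j: math.exp(scores[j] - top) for j in keep}
        total = sum(exps.values())
        weights = [exps[j] / total if j in exps else 0.0 for j in range(len(keys))]
        # Weighted sum of the value vectors.
        out = [sum(w * v[c] for w, v in zip(weights, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    # One query and four keys; with k=2 only the two most similar keys get weight.
    queries = [[1.0, 0.0]]
    keys = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]]
    values = [[1.0], [2.0], [3.0], [4.0]]
    outputs, weights = topk_sparse_attention(queries, keys, values, k=2)
    print(weights[0])
    print(outputs[0])
```

In this toy input the third and fourth keys fall outside the top two, so their weights are exactly zero and they contribute nothing to the output, which is the dilution-avoiding behavior the abstract attributes to sparse attention.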
Pages: 154418 - 154425
Page count: 8
Related papers
50 in total
  • [21] Local to Global: A Sparse Transformer-Based Small Object Detector for Remote Sensing Images
    Li, Zheng
    Wang, Yongcheng
    Feng, Hao
    Chen, Chi
    Xu, Dongdong
    Zhao, Tianqi
    Gao, Yunxiao
    Zhao, Zhikang
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [22] Transformer visual object tracking algorithm based on mixed attention
    Hou Z.-Q.
    Guo F.
    Yang X.-L.
    Ma S.-G.
    Fan J.-L.
    Kongzhi yu Juece/Control and Decision, 2024, 39 (03) : 739 - 748
  • [23] A Transformer-based visual object tracker via learning immediate appearance change
    Li, Yifan
    Liu, Xiaotao
    Yuan, Dian
    Wang, Jiaoying
    Wu, Peng
    Liu, Jing
    PATTERN RECOGNITION, 2024, 155
  • [24] Transformer-Based Maneuvering Target Tracking
    Zhao, Guanghui
    Wang, Zelin
    Huang, Yixiong
    Zhang, Huirong
    Ma, Xiaojing
    SENSORS, 2022, 22 (21)
  • [25] A Sparse Transformer-Based Approach for Image Captioning
    Lei, Zhou
    Zhou, Congcong
    Chen, Shengbo
    Huang, Yiyong
    Liu, Xianrui
    IEEE Access, 2020, 8 : 213437 - 213446
  • [27] Transformer-Based Visual Segmentation: A Survey
    Li, Xiangtai
    Ding, Henghui
    Yuan, Haobo
    Zhang, Wenwei
    Pang, Jiangmiao
    Cheng, Guangliang
    Chen, Kai
    Liu, Ziwei
    Loy, Chen Change
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (12) : 10138 - 10163
  • [28] Transformer-based two-source motion model for multi-object tracking
    Yang, Jieming
    Ge, Hongwei
    Su, Shuzhi
    Liu, Guoqing
    APPLIED INTELLIGENCE, 2022, 52 (09) : 9967 - 9979
  • [30] AnchorPoint: Query Design for Transformer-Based 3D Object Detection and Tracking
    Liu, Hao
    Ma, Yanni
    Wang, Hanyun
    Zhang, Chaobo
    Guo, Yulan
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (10) : 10988 - 11000