Sparse Transformer-Based Sequence Generation for Visual Object Tracking

Authors
Tian, Dan [1]
Liu, Dong-Xin [2]
Wang, Xiao [2]
Hao, Ying [2]
Affiliations
[1] Shenyang Univ, Sch Intelligent Syst Sci & Engn, Shenyang 110044, Liaoning, Peoples R China
[2] Shenyang Univ, Sch Informat Engn, Shenyang 110044, Liaoning, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Transformers; Visualization; Target tracking; Decoding; Feature extraction; Attention mechanisms; Object tracking; Training; Interference; Attention mechanism; sequence generation; sparse attention; visual object tracking; vision transformer;
DOI
10.1109/ACCESS.2024.3482468
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In visual object tracking, attention mechanisms can flexibly and efficiently model complex dependencies and global information, which improves tracking accuracy. However, in scenes that contain a large amount of background or other distracting content, global attention can dilute the weight of important information and allocate unnecessary attention to the background, which reduces tracking performance. To alleviate this problem, this paper proposes a visual object tracking framework based on a sparse transformer. The framework is a simple encoder-decoder structure that predicts the target in an autoregressive manner, which eliminates the additional head network and simplifies the tracking architecture. Furthermore, we introduce a Sparse Attention Mechanism (SMA) in the cross-attention layer of the decoder. Unlike traditional attention mechanisms, SMA computes attention weights using only the top K pixels that are most relevant to the current pixel. This allows the model to focus on key information and improves foreground-background discrimination, resulting in more accurate and robust tracking. We evaluate the method on six tracking benchmarks, and the experimental results demonstrate its effectiveness.
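The top-K idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name topk_sparse_attention, the scaled dot-product scoring, and the toy inputs are all assumptions made for illustration. Each query keeps only its k highest-scoring keys, applies softmax over those, and assigns zero weight to every other key.

```python
import math


def topk_sparse_attention(queries, keys, values, k):
    """Hypothetical sketch: each query attends only to its k best-matching keys."""
    if not 1 <= k <= len(keys):
        raise ValueError("k must be between 1 and the number of keys")
    scale = 1.0 / math.sqrt(len(keys[0]))  # standard scaled dot-product factor
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
        # Indices of the k largest scores; all other keys are ignored.
        keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
        # Numerically stable softmax restricted to the kept keys.
        top = max(scores[j] for j in keep)
        exp = {j: math.exp(scores[j] - top) for j in keep}
        total = sum(exp.values())
        row = [exp[j] / total if j in exp else 0.0 for j in range(len(keys))]
        weights.append(row)
        # Weighted sum of value vectors; dropped keys contribute nothing.
        outputs.append([sum(w * v[d] for w, v in zip(row, values))
                        for d in range(len(values[0]))])
    return outputs, weights


# Toy example: one query, four keys, two of which are clearly irrelevant.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
out, w = topk_sparse_attention(queries, keys, values, k=2)
print(w[0])    # the last two weights are exactly 0.0
print(out[0])
```

With k=2, the two keys least similar to the query receive exactly zero weight, so their large value vectors cannot leak into the output. Dense softmax attention would instead give them small but nonzero weights, which is the dilution effect the abstract describes.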
Pages: 154418-154425
Page count: 8