SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV Tracking

Cited by: 11
Authors
Yao, Liangliang [1]
Fu, Changhong [1]
Li, Sihang [1]
Zheng, Guangze [2]
Ye, Junjie [1]
Affiliations
[1] Tongji Univ, Sch Mech Engn, Shanghai 201804, Peoples R China
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China
DOI
10.1109/ICRA48891.2023.10161487
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Vision-based object tracking has enabled a wide range of autonomous applications for unmanned aerial vehicles (UAVs). However, the dynamic changes in flight maneuver and viewpoint encountered in UAV tracking pose significant difficulties, e.g., aspect ratio change and scale variation. The conventional cross-correlation operation, while commonly used, is limited in how well it captures perceptual similarity, and it incorporates extraneous background information. To mitigate these limitations, this work presents a novel saliency-guided dynamic vision Transformer (SGDViT) for UAV tracking. The proposed method designs a new task-specific object saliency mining network to refine the cross-correlation operation and effectively discriminate foreground from background information. Additionally, a saliency adaptation embedding operation dynamically generates tokens based on initial saliency, thereby reducing the computational complexity of the Transformer architecture. Finally, a lightweight saliency filtering Transformer further refines saliency information and increases the focus on appearance information. The efficacy and robustness of the proposed approach have been thoroughly assessed through experiments on three widely used UAV tracking benchmarks and in real-world scenarios, with results demonstrating its superiority. The source code and demo videos are available at https://github.com/vision4robotics/SGDViT.
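For readers unfamiliar with the baseline the abstract criticizes, the following is a minimal sketch of a cross-correlation between template and search-region features. It assumes the depth-wise variant common in Siamese trackers; the abstract does not state which variant the paper starts from, and the function name `depthwise_xcorr` and the array shapes are illustrative choices, not taken from the SGDViT code.

```python
import numpy as np


def depthwise_xcorr(search: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Depth-wise cross-correlation (illustrative baseline, not SGDViT itself).

    search:   feature map of the search region, shape (C, Hs, Ws)
    template: feature map of the target template, shape (C, Ht, Wt)
    returns:  response map, shape (C, Hs - Ht + 1, Ws - Wt + 1)
    """
    if search.ndim != 3 or template.ndim != 3:
        raise ValueError("both inputs must have shape (C, H, W)")
    if search.shape[0] != template.shape[0]:
        raise ValueError("channel counts must match")

    _, ht, wt = template.shape
    # Every (Ht, Wt) window of the search map, per channel:
    # resulting shape is (C, Ho, Wo, Ht, Wt).
    windows = np.lib.stride_tricks.sliding_window_view(
        search, (ht, wt), axis=(1, 2)
    )
    # Each channel of the template is slid over the same channel of the
    # search map; every template pixel contributes with equal standing.
    return np.einsum("cijhw,chw->cij", windows, template)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    search_feat = rng.standard_normal((8, 16, 16))
    template_feat = rng.standard_normal((8, 6, 6))
    response = depthwise_xcorr(search_feat, template_feat)
    print(response.shape)
```

Because every pixel of the template window enters the sum, background pixels inside the template box influence the response as much as target pixels do. This is the "extraneous background information" issue that the abstract says the saliency mining network is meant to address.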
Pages: 3353-3359
Number of pages: 7