SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV Tracking

Cited by: 11
Authors
Yao, Liangliang [1]
Fu, Changhong [1]
Li, Sihang [1]
Zheng, Guangze [2]
Ye, Junjie [1]
Affiliations
[1] Tongji Univ, Sch Mech Engn, Shanghai 201804, Peoples R China
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China
DOI
10.1109/ICRA48891.2023.10161487
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Vision-based object tracking has boosted extensive autonomous applications for unmanned aerial vehicles (UAVs). However, the dynamic changes in flight maneuver and viewpoint encountered in UAV tracking pose significant difficulties, e.g., aspect ratio change and scale variation. The conventional cross-correlation operation, while commonly used, is limited in capturing perceptual similarity and incorporates extraneous background information. To mitigate these limitations, this work presents a novel saliency-guided dynamic vision Transformer (SGDViT) for UAV tracking. The proposed method designs a new task-specific object saliency mining network to refine the cross-correlation operation and effectively discriminate foreground and background information. Additionally, a saliency adaptation embedding operation dynamically generates tokens based on initial saliency, thereby reducing the computational complexity of the Transformer architecture. Finally, a lightweight saliency filtering Transformer further refines saliency information and increases the focus on appearance information. The efficacy and robustness of the proposed approach have been assessed through experiments on three widely used UAV tracking benchmarks and in real-world scenarios, with results demonstrating its superiority. The source code and demo videos are available at https://github.com/vision4robotics/SGDViT.
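For readers unfamiliar with the two operations the abstract names, the sketch below illustrates them in NumPy. It is not taken from the SGDViT code base: `depthwise_xcorr` shows the conventional depthwise cross-correlation commonly used in Siamese trackers, and `select_salient_tokens` is a hypothetical top-k stand-in for the idea of generating fewer tokens from a saliency map. The function names, tensor shapes, and the top-k rule are all assumptions made for illustration.

```python
import numpy as np

def depthwise_xcorr(search, template):
    """Conventional depthwise cross-correlation (valid mode, stride 1).

    search:   (C, H, W) feature map of the search region
    template: (C, h, w) feature map of the target template
    returns:  (C, H-h+1, W-w+1) per-channel response map
    """
    c, H, W = search.shape
    ct, h, w = template.shape
    assert c == ct, "channel counts must match"
    out = np.zeros((c, H - h + 1, W - w + 1), dtype=search.dtype)
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            window = search[:, i:i + h, j:j + w]
            # Each channel is correlated only with its own template channel.
            out[:, i, j] = (window * template).sum(axis=(1, 2))
    return out

def select_salient_tokens(features, saliency, k):
    """Hypothetical illustration: keep the k most salient positions as tokens.

    features: (C, H, W) feature map
    saliency: (H, W) saliency scores
    returns:  (k, C) token matrix and the flat indices that were kept
    """
    c, H, W = features.shape
    tokens = features.reshape(c, H * W).T          # (H*W, C)
    idx = np.argsort(saliency.ravel())[::-1][:k]   # highest saliency first
    return tokens[idx], idx
```

Keeping k tokens instead of all H*W positions shrinks the self-attention cost, which grows quadratically with the token count; this is the general motivation behind saliency-driven token reduction, while the paper's actual embedding operation may differ.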
Pages: 3353-3359
Page count: 7