SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV Tracking

Cited by: 11

Authors:

Yao, Liangliang [1]

Fu, Changhong [1]

Li, Sihang [1]

Zheng, Guangze [2]

Ye, Junjie [1]

Affiliations:

[1] Tongji Univ, Sch Mech Engn, Shanghai 201804, Peoples R China

[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA | 2023

Funding:

Natural Science Foundation of Shanghai; National Natural Science Foundation of China

Keywords:

DOI:

10.1109/ICRA48891.2023.10161487

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Vision-based object tracking has boosted extensive autonomous applications for unmanned aerial vehicles (UAVs). However, the dynamic changes in flight maneuver and viewpoint encountered in UAV tracking pose significant difficulties, e.g., aspect ratio change and scale variation. The conventional cross-correlation operation, while commonly used, is limited in how effectively it captures perceptual similarity, and it incorporates extraneous background information. To mitigate these limitations, this work presents a novel saliency-guided dynamic vision Transformer (SGDViT) for UAV tracking. The proposed method designs a new task-specific object saliency mining network to refine the cross-correlation operation and effectively discriminate foreground and background information. Additionally, a saliency adaptation embedding operation dynamically generates tokens based on initial saliency, thereby reducing the computational complexity of the Transformer architecture. Finally, a lightweight saliency filtering Transformer further refines saliency information and increases the focus on appearance information. The efficacy and robustness of the proposed approach have been thoroughly assessed through experiments on three widely used UAV tracking benchmarks and real-world scenarios, with results demonstrating its superiority. The source code and demo videos are available at https://github.com/vision4robotics/SGDViT.

Pages: 3353-3359

Page count: 7

50 records in total

[1] Saliency-Guided Lighting
Lee, Chang Ha
Kim, Youngmin
Varshney, Amitabh
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2009, E92D (02): 369-373
[2] Improving a Vision Indoor Localization System by a Saliency-Guided Detection
Elloumi, Wael
Guissous, Kamel
Chetouani, Aladine
Treuillet, Sylvie
2014 IEEE VISUAL COMMUNICATIONS AND IMAGE PROCESSING CONFERENCE, 2014: 149-152
[3] Saliency-Guided Video Deinterlacing
Trocan, Maria
Coudoux, Francois-Xavier
COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2015), PT II, 2015, 9330: 24-33
[4] Saliency-Guided Image Translation
Jiang, Lai
Xu, Mai
Wang, Xiaofei
Sigal, Leonid
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 16504-16513
[5] Enhancing by Saliency-guided Decolorization
Ancuti, Codruta Orniana
Ancuti, Cosmin
Bekaert, Phillipe
2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2011: 257-264
[6] Saliency-guided image translation
Jiang, Lai
Dai, Ning
Xu, Mai
Deng, Xin
Li, Shengxi
Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2023, 49 (10): 2689-2698
[7] UAV Image Haze Removal Based on Saliency-Guided Parallel Learning Mechanism
Zheng, Ruohui
Zhang, Libao
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
[8] Saliency-guided Pairwise Matching
Huang, Shao
Wang, Weiqiang
PATTERN RECOGNITION LETTERS, 2017, 97: 37-43
[9] GazeFusion: Saliency-Guided Image Generation
Zhang, Yunxiang
Wu, Nan
Lin, Connor Z.
Wetzstein, Gordon
Sun, Qi
ACM TRANSACTIONS ON APPLIED PERCEPTION, 2024, 21 (04)
[10] Saliency-guided compressive fluorescence microscopy
Schwartz, Shimon
Wong, Alexander
Clausi, David A.
2012 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2012: 4365-4368