In recent years, deep-learning-based visual object tracking has achieved promising results. However, a drastic performance drop is observed when a pre-trained model is transferred to changing weather conditions, such as hazy imaging scenarios, where the data distribution differs from that of the natural training set. This problem hinders the practical application of accurate target tracking in open-world settings. In principle, visual tracking performance depends on how discriminative the extracted features are between the target and its surroundings, rather than on image-level visual quality. To this end, we design a feature restoration transformer that adaptively enhances the representation capability of the extracted visual features for robust tracking in both natural and hazy scenarios. Specifically, the feature restoration transformer is constructed with dedicated self-attention hierarchies that refine potentially contaminated deep feature maps. By endowing the feature extraction process with this refinement mechanism, tailored to hazy imaging scenarios, we establish a tracking system that is robust against foggy videos. In essence, the feature restoration transformer is jointly trained with a Siamese tracking transformer, so that the supervision for learning discriminative and salient features is provided by the entire restoration-tracking system. Experimental results on hazy imaging scenarios demonstrate the merits and superiority of the proposed restoration-tracking system, whose restoration power is complementary to image-level dehazing. In addition, our design shows consistent advantages when generalised to different video attributes, demonstrating its capacity to deal with open-world scenarios.
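The joint restoration-and-tracking design can be made concrete with a small sketch. The following is a minimal, hypothetical PyTorch illustration of the idea only, assuming a generic CNN backbone, a two-layer transformer encoder as the restoration module, and a simple depth-wise cross-correlation head standing in for the Siamese tracking transformer; all class names, layer sizes, and hyper-parameters are our own illustrative assumptions, not the paper's actual architecture.

```python
# Illustrative sketch (not the paper's architecture): a transformer encoder
# that refines potentially haze-contaminated backbone feature maps before a
# Siamese tracking head, trained end-to-end so the tracking loss also
# supervises the restoration module.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureRestorationTransformer(nn.Module):
    """Refines contaminated feature maps with self-attention over tokens.

    Positional encodings are omitted for brevity.
    """

    def __init__(self, channels: int = 256, num_layers: int = 2, num_heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) -> tokens: (B, H*W, C)
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)
        refined = self.encoder(tokens)
        # Residual connection: the module learns a feature-level correction.
        return feat + refined.transpose(1, 2).reshape(b, c, h, w)


class RestorationSiameseTracker(nn.Module):
    """Backbone -> feature restoration -> cross-correlation tracking head."""

    def __init__(self, backbone: nn.Module, channels: int = 256):
        super().__init__()
        self.backbone = backbone
        self.restore = FeatureRestorationTransformer(channels)

    def forward(self, template: torch.Tensor, search: torch.Tensor) -> torch.Tensor:
        z = self.restore(self.backbone(template))  # exemplar features
        x = self.restore(self.backbone(search))    # search-region features
        # Depth-wise cross-correlation: each search channel is correlated
        # with the matching template channel, then averaged into a score map.
        b, c, hz, wz = z.shape
        x = x.reshape(1, b * c, *x.shape[-2:])
        score = F.conv2d(x, z.reshape(b * c, 1, hz, wz), groups=b * c)
        return score.reshape(b, c, *score.shape[-2:]).mean(dim=1)  # (B, H', W')


if __name__ == "__main__":
    backbone = nn.Conv2d(3, 256, kernel_size=7, stride=4)  # stand-in backbone
    tracker = RestorationSiameseTracker(backbone)
    template = torch.randn(2, 3, 64, 64)    # small exemplar crop
    search = torch.randn(2, 3, 128, 128)    # larger search region
    score = tracker(template, search)       # (2, 17, 17) response map
    # Dummy loss: in practice this would be the tracking classification /
    # regression loss, whose gradient also updates the restoration module.
    score.mean().backward()
```

Because the restoration module sits inside the tracking forward pass, minimising the tracking loss on the response map supervises the restoration transformer as well, which is the essence of the joint training described above.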