Two-stage Unidirectional Fusion Network for RGBT tracking

Cited by: 0

Authors:

Liu, Yisong [1]

Gao, Zhao [1]

Cao, Yang [2]

Zhou, Dongming [1,3]

Affiliations:

[1] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650504, Yunnan, Peoples R China

[2] Southeast Univ, Sch Cyber Sci & Engn, Nanjing 210096, Peoples R China

[3] Hunan Univ Informat Technol, Sch Elect Sci & Engn, Changsha 410100, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2025 / Vol. 310

Funding:

National Natural Science Foundation of China;

Keywords:

RGBT object tracking; Prompt learning; Multi-modal fusion; Causal decoder;

DOI:

10.1016/j.knosys.2025.112983

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

RGB and Thermal (RGBT) tracking has recently attracted significant attention for its ability to accurately localize targets in complex scenarios. However, the creation of large-scale RGBT tracking datasets is both resource-intensive and laborious, motivating researchers to develop prompt tuning methods to adapt upstream RGB trackers to multimodal data with minimal additional parameters. Nevertheless, these methods do not fully exploit the supplementary modality information and tend to overlook the dynamic advantages between the two modalities in challenging scenarios. To address these issues, we propose a Two-stage Unidirectional Fusion (TUF) algorithm for RGBT tracking. This approach maximizes knowledge retention from upstream models while effectively leveraging the complementarity between the two modalities. It allows the powerful RGB feature extraction backbone from the upstream model to guide TIR image feature extraction through a two-stage unidirectional fusion strategy. Additionally, we introduce an autoregressive decoder into RGBT tracking as a replacement for traditional bounding box prediction heads. This streamlines the framework of our RGBT tracker and improves tracking accuracy. Extensive experiments conducted on four widely used RGBT tracking benchmarks validate that our method surpasses existing state-of-the-art prompt tuning approaches, achieving a superior balance between performance and efficiency.

Pages: 13
