Learning Modality Complementary Features with Mixed Attention Mechanism for RGB-T Tracking

Cited by: 8
Authors
Luo, Yang [1 ,2 ]
Guo, Xiqing [1 ,2 ]
Dong, Mingtao [3 ]
Yu, Jin [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100040, Peoples R China
[3] Northeastern Univ, Inst Image Recognit & Machine Intelligence, Shenyang 110167, Peoples R China
Keywords
multi-modality adaptive fusion; mixed-attention mechanism; RGB-T tracking; network
DOI
10.3390/s23146609
Chinese Library Classification
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
RGB-T tracking uses images from both the visible and thermal modalities. The primary objective is to adaptively leverage whichever modality is dominant under varying conditions, achieving more robust tracking than single-modality approaches. This paper proposes an RGB-T tracker based on a mixed-attention mechanism that achieves complementary fusion of the two modalities (referred to as MACFT). In the feature extraction stage, separate transformer backbone branches extract modality-specific and modality-shared information. Mixed-attention operations in the backbone enable information interaction and self-enhancement between the template and search images, constructing a robust feature representation that better captures the high-level semantic features of the target. In the feature fusion stage, a modality shared-specific feature interaction structure is designed based on the mixed-attention mechanism, effectively suppressing noise from the low-quality modality while enhancing information from the dominant one. Evaluation on multiple public RGB-T datasets demonstrates that the proposed tracker outperforms other RGB-T trackers on standard evaluation metrics and also adapts well to long-term tracking scenarios.
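The core fusion idea in the abstract (each modality's features are refined both by themselves and by the other modality through attention) can be illustrated with a minimal sketch. This is not the authors' MACFT implementation; the function names, token counts, and the simple averaging of the self- and cross-attention outputs are illustrative assumptions, and real trackers would use learned projections and multiple heads.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_feats, kv_feats):
    """Scaled dot-product attention: tokens in q_feats attend to kv_feats.
    q_feats: (Nq, d), kv_feats: (Nkv, d) -> (Nq, d)."""
    d = q_feats.shape[-1]
    scores = q_feats @ kv_feats.T / np.sqrt(d)   # (Nq, Nkv) token affinities
    return softmax(scores, axis=-1) @ kv_feats   # weighted sum of kv tokens

def mixed_attention_fuse(rgb, tir):
    """Hypothetical 'mixed attention': each modality is enhanced by
    self-attention on its own tokens and cross-attention to the other
    modality's tokens; here the two results are simply averaged."""
    rgb_out = 0.5 * (cross_attention(rgb, rgb) + cross_attention(rgb, tir))
    tir_out = 0.5 * (cross_attention(tir, tir) + cross_attention(tir, rgb))
    return rgb_out, tir_out

rng = np.random.default_rng(0)
rgb = rng.standard_normal((16, 32))   # 16 RGB feature tokens, 32-dim
tir = rng.standard_normal((16, 32))   # 16 thermal feature tokens, 32-dim
r, t = mixed_attention_fuse(rgb, tir)
print(r.shape, t.shape)  # (16, 32) (16, 32)
```

Because the attention weights are a softmax over token affinities, a modality whose tokens correlate poorly with the query (e.g. a degraded RGB frame at night) contributes less to the fused output, which is one simple way to realize the adaptive dominant-modality behavior the abstract describes.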
Pages: 19