Learning Modality Complementary Features with Mixed Attention Mechanism for RGB-T Tracking

Cited by: 8
Authors
Luo, Yang [1 ,2 ]
Guo, Xiqing [1 ,2 ]
Dong, Mingtao [3 ]
Yu, Jin [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100040, Peoples R China
[3] Northeastern Univ, Inst Image Recognit & Machine Intelligence, Shenyang 110167, Peoples R China
Keywords
multi-modality adaptive fusion; mixed-attention mechanism; RGB-T tracking; network
DOI
10.3390/s23146609
CLC Classification Number
O65 [Analytical Chemistry]
Discipline Classification Codes
070302; 081704
Abstract
RGB-T tracking uses images from both the visible and thermal modalities. The primary objective is to adaptively leverage the relatively dominant modality under varying conditions to achieve more robust tracking than single-modality tracking. This paper proposes an RGB-T tracker based on a mixed-attention mechanism that achieves a complementary fusion of modalities (referred to as MACFT). In the feature extraction stage, different transformer backbone branches extract modality-specific and modality-shared information. Mixed-attention operations in the backbone enable information interaction and self-enhancement between the template and search images, constructing a robust feature representation that better captures the high-level semantic features of the target. In the feature fusion stage, a modality shared-specific feature interaction structure is designed based on the mixed-attention mechanism, effectively suppressing noise from the low-quality modality while enhancing information from the dominant modality. Evaluation on multiple public RGB-T datasets demonstrates that the proposed tracker outperforms other RGB-T trackers on general evaluation metrics and also adapts to long-term tracking scenarios.
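To make the mixed-attention fusion described in the abstract more concrete, the following is a minimal sketch in PyTorch, assuming a self-attention plus cross-attention layout between RGB and thermal token sequences. The class name MixedAttentionFusion, the token dimensions, and the residual/projection design are illustrative assumptions; this does not reproduce the authors' MACFT implementation.

# Hypothetical sketch of a mixed-attention fusion block for RGB-T features.
# Names, dimensions, and layout are illustrative assumptions, not the
# authors' released MACFT code.
import torch
import torch.nn as nn


class MixedAttentionFusion(nn.Module):
    """Fuses RGB and thermal token sequences with self- and cross-attention."""

    def __init__(self, d_model: int = 256, n_heads: int = 8):
        super().__init__()
        # Self-attention enhances each modality's own (specific) features.
        self.self_attn_rgb = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.self_attn_tir = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Cross-attention lets each modality query shared information from
        # the other, which can suppress noise from a low-quality modality.
        self.cross_attn_rgb = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn_tir = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, rgb_tokens: torch.Tensor, tir_tokens: torch.Tensor) -> torch.Tensor:
        # rgb_tokens, tir_tokens: (batch, num_tokens, d_model)
        rgb_self, _ = self.self_attn_rgb(rgb_tokens, rgb_tokens, rgb_tokens)
        tir_self, _ = self.self_attn_tir(tir_tokens, tir_tokens, tir_tokens)
        # Each modality attends to the other (query = self, key/value = other).
        rgb_cross, _ = self.cross_attn_rgb(rgb_self, tir_self, tir_self)
        tir_cross, _ = self.cross_attn_tir(tir_self, rgb_self, rgb_self)
        # Residual connections preserve modality-specific information.
        rgb_out = self.norm(rgb_self + rgb_cross)
        tir_out = self.norm(tir_self + tir_cross)
        # Concatenate both streams and project back to the working dimension.
        return self.proj(torch.cat([rgb_out, tir_out], dim=-1))


if __name__ == "__main__":
    fusion = MixedAttentionFusion()
    rgb = torch.randn(2, 64, 256)   # e.g. search-region tokens from the RGB branch
    tir = torch.randn(2, 64, 256)   # corresponding tokens from the thermal branch
    print(fusion(rgb, tir).shape)   # torch.Size([2, 64, 256])

In this sketch, the self-attention paths stand in for modality-specific feature enhancement, while the two cross-attention paths approximate the shared-specific interaction that lets the dominant modality compensate for the weaker one.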
Pages: 19
Related Papers
50 records in total
  • [21] Lei, Lei; Li, Xianxian. RGB-T tracking with frequency hybrid awareness. IMAGE AND VISION COMPUTING, 2024, 152.
  • [22] Hu, Xiantao; Zhong, Bineng; Liang, Qihua; Zhang, Shengping; Li, Ning; Li, Xianxian. Toward Modalities Correlation for RGB-T Tracking. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10): 9102-9111.
  • [23] Li, Chenglong; Zhu, Chengli; Zheng, Shaofei; Luo, Bin; Tang, Jing. Two-stage modality-graphs regularized manifold ranking for RGB-T tracking. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2018, 68: 207-217.
  • [24] Kang, Bin; Liang, Dong; Mei, Junxi; Tan, Xiaoyang; Zhou, Quan; Zhang, Dengyin. Robust RGB-T Tracking via Graph Attention-Based Bilinear Pooling. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (12): 9900-9911.
  • [25] Zhang, Xingming; Zhang, Xuehan; Du, Xuedan; Zhou, Xiangming; Yin, Jun. Learning Multi-domain Convolutional Network for RGB-T Visual Tracking. 2018 11TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2018), 2018.
  • [26] Wang, Yulong; Li, Chenglong; Tang, Jin. Learning Soft-Consistent Correlation Filters for RGB-T Object Tracking. PATTERN RECOGNITION AND COMPUTER VISION (PRCV 2018), PT IV, 2018, 11259: 295-306.
  • [27] Zhang, Qiang; Xi, Ruida; Xiao, Tonglin; Huang, Nianchang; Luo, Yongjiang. Enabling modality interactions for RGB-T salient object detection. COMPUTER VISION AND IMAGE UNDERSTANDING, 2022, 222.
  • [28] Li, Chenglong; Zhao, Nan; Lu, Yijuan; Zhu, Chengli; Tang, Jin. Weighted Sparse Representation Regularized Graph Learning for RGB-T Object Tracking. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017: 1856-1864.
  • [29] Wang, He; Xu, Tianyang; Tang, Zhangyong; Wu, Xiao-Jun; Kittler, Josef. Multi-modal adapter for RGB-T tracking. INFORMATION FUSION, 2025, 118.
  • [30] Kong, Weihang; Li, He; Zhao, Fengda. Multiscale Modality-Similar Learning Guided Weakly Supervised RGB-T Crowd Counting. IEEE SENSORS JOURNAL, 2024, 24 (18): 29121-29134.