Learning Modality Complementary Features with Mixed Attention Mechanism for RGB-T Tracking

Cited by: 8

Authors:

Luo, Yang ^{[1,2]}

Guo, Xiqing ^{[1,2]}

Dong, Mingtao ^{[3]}

Yu, Jin ^{[1,2]}

Affiliations:

[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China

[2] Univ Chinese Acad Sci, Beijing 100040, Peoples R China

[3] Northeastern Univ, Inst Image Recognit & Machine Intelligence, Shenyang 110167, Peoples R China

Source:

SENSORS | 2023 / Vol. 23 / Issue 14

Keywords:

multi-modality adaptive fusion; mixed-attention mechanism; RGB-T tracking; NETWORK;

DOI:

10.3390/s23146609

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Discipline classification codes:

070302 ; 081704 ;

Abstract:

RGB-T tracking involves the use of images from both visible and thermal modalities. The primary objective is to adaptively leverage the relatively dominant modality in varying conditions to achieve more robust tracking than single-modality tracking. This paper proposes an RGB-T tracker, referred to as MACFT, that uses a mixed-attention mechanism to achieve a complementary fusion of modalities. In the feature extraction stage, different transformer backbone branches extract specific and shared information from the different modalities. Mixed-attention operations in the backbone enable information interaction and self-enhancement between the template and search images, producing a robust feature representation that better captures the high-level semantic features of the target. In the feature fusion stage, a modality shared-specific feature interaction structure based on a mixed-attention mechanism suppresses low-quality modality noise while enhancing the information from the dominant modality. Evaluation on multiple public RGB-T datasets demonstrates that the proposed tracker outperforms other RGB-T trackers on general evaluation metrics while also adapting to long-term tracking scenarios.


Pages: 19
