Learning Modality Complementary Features with Mixed Attention Mechanism for RGB-T Tracking

Cited by: 8
Authors
Luo, Yang [1 ,2 ]
Guo, Xiqing [1 ,2 ]
Dong, Mingtao [3 ]
Yu, Jin [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100040, Peoples R China
[3] Northeastern Univ, Inst Image Recognit & Machine Intelligence, Shenyang 110167, Peoples R China
Keywords
multi-modality adaptive fusion; mixed-attention mechanism; RGB-T tracking; network
DOI
10.3390/s23146609
Chinese Library Classification
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
RGB-T tracking uses images from both the visible and thermal modalities. The primary objective is to adaptively leverage whichever modality is dominant under varying conditions, achieving more robust tracking than single-modality approaches. This paper proposes an RGB-T tracker based on a mixed-attention mechanism that achieves complementary fusion of the two modalities (referred to as MACFT). In the feature extraction stage, separate transformer backbone branches extract modality-specific and modality-shared information. Mixed-attention operations in the backbone enable information interaction and self-enhancement between the template and search images, constructing a robust feature representation that better captures the high-level semantic features of the target. In the feature fusion stage, a modality shared-specific feature interaction structure is designed based on the mixed-attention mechanism, effectively suppressing noise from the low-quality modality while enhancing information from the dominant one. Evaluation on multiple public RGB-T datasets demonstrates that the proposed tracker outperforms other RGB-T trackers on standard evaluation metrics and also adapts well to long-term tracking scenarios.
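The core fusion idea in the abstract (each modality's features are refined both by themselves and by the other modality through attention) can be illustrated with a minimal sketch. This is not the authors' MACFT implementation; the function names, token counts, and the simple averaging of the self- and cross-attention outputs are illustrative assumptions, and real trackers would use learned projections and multiple heads.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_feats, kv_feats):
    """Scaled dot-product attention: tokens in q_feats attend to kv_feats.
    q_feats: (Nq, d), kv_feats: (Nkv, d) -> (Nq, d)."""
    d = q_feats.shape[-1]
    scores = q_feats @ kv_feats.T / np.sqrt(d)   # (Nq, Nkv) token affinities
    return softmax(scores, axis=-1) @ kv_feats   # weighted sum of kv tokens

def mixed_attention_fuse(rgb, tir):
    """Hypothetical 'mixed attention': each modality is enhanced by
    self-attention on its own tokens and cross-attention to the other
    modality's tokens; here the two results are simply averaged."""
    rgb_out = 0.5 * (cross_attention(rgb, rgb) + cross_attention(rgb, tir))
    tir_out = 0.5 * (cross_attention(tir, tir) + cross_attention(tir, rgb))
    return rgb_out, tir_out

rng = np.random.default_rng(0)
rgb = rng.standard_normal((16, 32))   # 16 RGB feature tokens, 32-dim
tir = rng.standard_normal((16, 32))   # 16 thermal feature tokens, 32-dim
r, t = mixed_attention_fuse(rgb, tir)
print(r.shape, t.shape)  # (16, 32) (16, 32)
```

Because the attention weights are a softmax over token affinities, a modality whose tokens correlate poorly with the query (e.g. a degraded RGB frame at night) contributes less to the fused output, which is one simple way to realize the adaptive dominant-modality behavior the abstract describes.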
Pages: 19