Multi-scale feature extraction and fusion with attention interaction for RGB-T tracking

Cited by: 1
Authors
Xing, Haijiao [1 ]
Wei, Wei [1 ]
Zhang, Lei [1 ]
Zhang, Yanning [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Single-object tracking; RGB-T tracking; Feature fusion; SIAMESE NETWORKS; TRACKING;
DOI
10.1016/j.patcog.2024.110917
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
RGB-T single-object tracking aims to track objects using both RGB images and thermal infrared (TIR) images. Although Siamese-based RGB-T trackers have an advantage in tracking speed, their accuracy still cannot match other state-of-the-art trackers (e.g., MDNet). In this study, we revisit existing Siamese-based RGB-T trackers and find that this lag stems from insufficient feature fusion between the RGB and TIR images, as well as incomplete interaction between the template frame and the search frame. Motivated by this, we propose a multi-scale feature extraction and fusion network with Temporal-Spatial Memory (MFATrack). Instead of fusing the RGB and TIR images with a single-scale feature map or only the high-level features of a multi-scale feature map, MFATrack adopts a new fusion strategy that fuses features from all scales, capturing contextual information in shallow layers and details in the deep layer. To learn features better suited to the tracking task, MFATrack fuses features across several consecutive frames. In addition, we propose a self-attention interaction module designed specifically for the search frame, which highlights the features in the search frame that are relevant to the target and thus facilitates rapid convergence for target localization. Experimental results demonstrate that MFATrack is not only fast but also achieves better tracking accuracy than competing methods, including MDNet-based methods and other Siamese-based trackers.
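The two ideas in the abstract — fusing RGB and TIR features at every scale of the pyramid, and applying self-attention over the search-frame features — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the concatenation-plus-mixing fusion, the uniform mixing weights, and the plain scaled dot-product attention are all simplifying assumptions made for illustration.

```python
import numpy as np

def fuse_scales(rgb_feats, tir_feats):
    """Fuse RGB and TIR feature maps at every pyramid scale.
    Illustrative: channel concatenation followed by a learned-style
    (here uniform) channel-mixing projection back to C channels."""
    fused = []
    for r, t in zip(rgb_feats, tir_feats):
        f = np.concatenate([r, t], axis=0)                 # (2C, H, W)
        w = np.ones((r.shape[0], f.shape[0])) / f.shape[0]  # (C, 2C) mixing
        fused.append(np.einsum('cd,dhw->chw', w, f))        # (C, H, W)
    return fused

def self_attention(x):
    """Scaled dot-product self-attention over spatial positions,
    as a stand-in for the search-frame interaction module."""
    c, h, w = x.shape
    tokens = x.reshape(c, h * w).T               # (N, C) spatial tokens
    scores = tokens @ tokens.T / np.sqrt(c)      # (N, N) similarities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # row-wise softmax
    out = attn @ tokens                          # re-weighted tokens
    return out.T.reshape(c, h, w)

# Toy three-level feature pyramid with 4 channels per scale.
rng = np.random.default_rng(0)
rgb = [rng.standard_normal((4, s, s)) for s in (8, 4, 2)]
tir = [rng.standard_normal((4, s, s)) for s in (8, 4, 2)]

fused = fuse_scales(rgb, tir)       # all scales fused, not just the deepest
attended = self_attention(fused[-1])
print([f.shape for f in fused], attended.shape)
```

The key point mirrored from the abstract is that `fuse_scales` runs over every level of the pyramid, so shallow and deep features both contribute to the fused representation before the attention step.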
Pages: 12
Related papers
50 records
  • [31] Hierarchical Feature Fusion With Text Attention For Multi-scale Text Detection
    Liu, Chao
    Zou, Yuexian
    Guan, Wenjie
    2018 IEEE 23RD INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2018,
  • [32] Adaptive feature fusion with attention mechanism for multi-scale target detection
    Ju, Moran
    Luo, Jiangning
    Wang, Zhongbo
    Luo, Haibo
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (07): : 2769 - 2781
  • [33] Multi-Scale Feature Fusion Network with Attention for Single Image Dehazing
    Pattern Recognition and Image Analysis, 2021, 31 : 608 - 615
  • [34] Integrating attention mechanism and multi-scale feature extraction for fall detection
    Chen, Hao
    Gu, Wenye
    Zhang, Qiong
    Li, Xiujing
    Jiang, Xiaojing
    HELIYON, 2024, 10 (10)
  • [35] Multi-Scale Feature Extraction Method of Hyperspectral Image with Attention Mechanism
    Xu Zhangchi
    Guo Baofeng
    Wu Wenhao
    You Jingyun
    Su Xiaotong
    LASER & OPTOELECTRONICS PROGRESS, 2024, 61 (04)
  • [36] MFCNet: Multimodal Feature Fusion Network for RGB-T Vehicle Density Estimation
    Qin, Ling-Xiao
    Sun, Hong-Mei
    Duan, Xiao-Meng
    Che, Cheng-Yue
    Jia, Rui-Sheng
    IEEE INTERNET OF THINGS JOURNAL, 2025, 12 (04): : 4207 - 4219
  • [37] A multi-scale feature extraction fusion model for human activity recognition
    Zhang, Chuanlin
    Cao, Kai
    Lu, Limeng
    Deng, Tao
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [39] Lightweight road extraction model based on multi-scale feature fusion
    Liu Y.
    Chen Y.
    Gao L.
    Hong J.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2024, 58 (05): : 951 - 959
  • [40] Co-Saliency Detection Based on Multi-Scale Feature Extraction and Feature Fusion
    Zuo, Kuangji
    Liang, Huiqing
    Wang, Dechen
    Zhang, Dehua
    2022 4TH INTERNATIONAL CONFERENCE ON CONTROL AND ROBOTICS, ICCR, 2022, : 364 - 368