SiamSEA: Semantic-Aware Enhancement and Associative-Attention Dual-Modal Siamese Network for Robust RGBT Tracking

Citations: 0
Authors
Zhuang, Zihan [1]
Yin, Mingfeng [1]
Gao, Qi [1]
Lin, Yong [1]
Hong, Xing [1]
Affiliations
[1] Jiangsu Univ Technol, Sch Automobile & Traff Engn, Changzhou 213001, Peoples R China
Source
IEEE Access | 2024 | Vol. 12
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Visualization; Real-time systems; Task analysis; Data mining; Deep learning; Image fusion; RGBT tracking; Siamese network; semantic-aware enhancement; associative-attention; adaptive best score selection;
DOI
10.1109/ACCESS.2024.3442810
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recently, RGBT tracking methods have been widely applied in visual tracking tasks owing to the complementarity of visible and thermal infrared images. However, in most RGBT trackers, the feature extraction network is not specifically trained for thermal infrared images, so thermal radiation information is incompletely expressed in the tracking task. To solve this problem, a novel RGBT Siamese tracker, SiamSEA, is proposed to enhance the expression of features from the different modalities. First, a semantic-aware enhancement (SE) module strengthens features in visible images by fusing complementary information. Second, to handle the different backgrounds in the dual-modal branches, we design an associative-attention mechanism comprising a shuffle attention enhancement (SAE) module and a channel attention enhancement (CAE) module. CAE focuses on object features and SAE attends to spatial information; both provide accurate features for the template matching calculation. Next, the dual-modal classification maps and all regression maps are fused at the response level. Finally, an adaptive best score selection (ABSS) module flexibly selects prediction results in different scenarios. Experimental results on three challenging datasets demonstrate the effectiveness and robustness of SiamSEA, which achieves the following MPR/MSR (%) and tracking speeds: GTOT (90.4/73.7, 99.4 fps), RGBT234 (77.2/53.8, 72.7 fps), and VTUAV (69.7/55.7, 32.3 fps).
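The abstract does not specify how response-level fusion or the ABSS module is computed, so the following is only a minimal illustrative sketch under stated assumptions: fusion is taken to be a weighted element-wise average of the two classification maps, and selection is taken to be "pick the candidate map with the highest peak score". The names `fuse_maps`, `peak`, `select_best`, and the weight `w_rgb` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Tuple

ScoreMap = List[List[float]]


def fuse_maps(rgb: ScoreMap, tir: ScoreMap, w_rgb: float = 0.5) -> ScoreMap:
    """Weighted element-wise average of two same-sized maps (assumed fusion rule)."""
    return [
        [w_rgb * a + (1.0 - w_rgb) * b for a, b in zip(row_rgb, row_tir)]
        for row_rgb, row_tir in zip(rgb, tir)
    ]


def peak(score_map: ScoreMap) -> Tuple[float, Tuple[int, int]]:
    """Return the highest score in the map and its (row, col) position."""
    best_score, best_pos = float("-inf"), (0, 0)
    for r, row in enumerate(score_map):
        for c, value in enumerate(row):
            if value > best_score:
                best_score, best_pos = value, (r, c)
    return best_score, best_pos


def select_best(candidates: Dict[str, ScoreMap]) -> Tuple[str, float, Tuple[int, int]]:
    """Pick the candidate whose peak score is highest (assumed selection rule)."""
    name = max(candidates, key=lambda k: peak(candidates[k])[0])
    score, pos = peak(candidates[name])
    return name, score, pos


# Toy 2x2 classification maps for the visible (RGB) and thermal (TIR) branches.
rgb_cls = [[0.1, 0.2], [0.3, 0.9]]
tir_cls = [[0.2, 0.4], [0.1, 0.5]]

candidates = {
    "rgb": rgb_cls,
    "tir": tir_cls,
    "fused": fuse_maps(rgb_cls, tir_cls),
}
choice = select_best(candidates)
print(choice)
```

In this toy case the visible branch has the most confident peak, so it is selected over the thermal and fused maps; a scene where the thermal response is sharper would flip the choice. The actual modules in SiamSEA may weight or score the candidates quite differently.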
Pages: 134874-134887
Page count: 14