Transformer-Based Cross-Modal Integration Network for RGB-T Salient Object Detection

Cited by: 2
Authors
Lv, Chengtao [1]
Zhou, Xiaofei [1]
Wan, Bin [1]
Wang, Shuai [2, 3]
Sun, Yaoqi [1, 3]
Zhang, Jiyong [1]
Yan, Chenggang [2]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310018, Peoples R China
[2] Sch Commun Engn, Hangzhou Dianzi Univ, Hangzhou 310018, Peoples R China
[3] Hangzhou Dianzi Univ, Lishui Inst, Lishui 323000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Transformers; Semantics; Decoding; Aggregates; Object detection; Fuses; Salient object detection; collaborative spatial attention; feature interaction; Swin transformer; interactive complement; IMAGE; KERNEL;
DOI
10.1109/TCE.2024.3390841
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract
Salient object detection (SOD) helps identify and locate objects of interest and can be applied in consumer electronics. RGB and RGB-D (depth) salient object detection has made great progress in recent years. However, there is still considerable room for improvement in exploiting the complementarity of the two modalities in RGB-T (thermal) SOD. This paper therefore proposes a Transformer-based Cross-modal Integration Network (TCINet) to detect salient objects in RGB-T images; it fuses two-modal features and interactively aggregates two-level features. The method consists of siamese Swin Transformer-based encoders, a cross-modal feature fusion (CFF) module, and an interaction-based feature decoding (IFD) block. The CFF module fuses the complementary information of the two-modal features, and its collaborative spatial attention emphasizes salient regions and suppresses background regions in those features. The IFD block aggregates two-level features, namely the previous-level fused feature and the current-level encoder feature, bridging the large semantic gap between them and reducing noise. Extensive experiments on three RGB-T datasets demonstrate the superiority and effectiveness of the method compared with cutting-edge saliency methods. The results and code will be available at https://github.com/lvchengtao/TCINet.
Pages: 4741-4755
Page count: 15
Related papers
50 in total
  • [31] CFRNet: Cross-Attention-Based Fusion and Refinement Network for Enhanced RGB-T Salient Object Detection
    Deng, Biao
    Liu, Di
    Cao, Yang
    Liu, Hong
    Yan, Zhiguo
    Chen, Hu
    SENSORS, 2024, 24 (22)
  • [32] Learning cross-modal interaction for RGB-T tracking
    Xu, Chunyan
    Cui, Zhen
    Wang, Chaoqun
    Zhou, Chuanwei
    Yang, Jian
    SCIENCE CHINA-INFORMATION SCIENCES, 2023, 66 (01)
  • [33] Learning cross-modal interaction for RGB-T tracking
    Chunyan XU
    Zhen CUI
    Chaoqun WANG
    Chuanwei ZHOU
    Jian YANG
    Science China (Information Sciences), 2023, 66 (01) : 320 - 321
  • [34] Learning cross-modal interaction for RGB-T tracking
    Chunyan Xu
    Zhen Cui
    Chaoqun Wang
    Chuanwei Zhou
    Jian Yang
    Science China Information Sciences, 2023, 66
  • [35] RGB-D salient object detection with asymmetric cross-modal fusion
    Yu M.
    Xing Z.-H.
    Liu Y.
    Kongzhi yu Juece/Control and Decision, 2023, 38 (09) : 2487 - 2495
  • [36] Weighted Guided Optional Fusion Network for RGB-T Salient Object Detection
    Wang, Jie
    Li, Guoqiang
    Shi, Jie
    Xi, Jinwen
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (05)
  • [37] Interactive context-aware network for RGB-T salient object detection
    Wang, Yuxuan
    Dong, Feng
    Zhu, Jinchao
    Chen, Jianren
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (28) : 72153 - 72174
  • [38] CAFCNet: Cross-modality asymmetric feature complement network for RGB-T salient object detection
    Jin, Dongze
    Shao, Feng
    Xie, Zhengxuan
    Mu, Baoyang
    Chen, Hangwei
    Jiang, Qiuping
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 247
  • [39] WaveNet: Wavelet Network With Knowledge Distillation for RGB-T Salient Object Detection
    Zhou, Wujie
    Sun, Fan
    Jiang, Qiuping
    Cong, Runmin
    Hwang, Jenq-Neng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 3027 - 3039
  • [40] Swin Transformer-Based Edge Guidance Network for RGB-D Salient Object Detection
    Wang, Shuaihui
    Jiang, Fengyi
    Xu, Boqian
    SENSORS, 2023, 23 (21)