A Feature Divide-and-Conquer Network for RGB-T Semantic Segmentation

Cited by: 22
Authors
Zhao, Shenlu [1, 2]
Zhang, Qiang [1, 2]
Affiliations
[1] Xidian Univ, Key Lab Elect Equipment Struct Design, Minist Educ, Xian 710071, Shaanxi, Peoples R China
[2] Xidian Univ, Ctr Complex Syst, Sch Mechanoelect Engn, Xian 710071, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Semantic segmentation; Data mining; Semantics; Lighting; Decoding; Thermal sensors; RGB-T semantic segmentation; feature divide-and-conquer strategy; multi-scale contextual information;
DOI
10.1109/TCSVT.2022.3229359
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract
Similar to other multi-modal pixel-level prediction tasks, existing RGB-T semantic segmentation methods usually employ a two-stream structure to extract RGB and thermal infrared (TIR) features separately, and adopt the same fusion strategy to integrate unimodal features at every level. This results in inadequate extraction of unimodal features and insufficient exploitation of cross-modal information from the paired RGB and TIR images. As an alternative, in this paper we present a novel RGB-T semantic segmentation model, FDCNet, in which a feature divide-and-conquer strategy performs unimodal feature extraction and cross-modal feature fusion in one go. Concretely, we first employ a two-stream structure to extract unimodal low-level features, followed by a Siamese structure to extract unimodal high-level features from the paired RGB and TIR images. This concise but efficient structure accounts for both the modality discrepancies of low-level features and the underlying semantic consistency of high-level features across the paired RGB and TIR images. Furthermore, considering the characteristics of features at different layers, a Cross-modal Spatial Activation (CSA) module is presented for fusing low-level RGB and TIR features and a Cross-modal Channel Activation (CCA) module for fusing high-level RGB and TIR features, thus facilitating the capture of cross-modal information. On top of that, with an embedded Cross-scale Interaction Context (CIC) module for mining multi-scale contextual information, the proposed FDCNet achieves new state-of-the-art experimental results on the MFNet and PST900 datasets.
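The divide-and-conquer layout described in the abstract can be illustrated with a toy sketch in plain Python. It is not the authors' implementation: the abstract does not specify the internals of CSA or CCA, so `spatial_fuse` and `channel_fuse` below are hypothetical stand-ins (sigmoid gating per pixel and per channel, respectively), the 1x1-convolution-like layers use random weights, the single-channel TIR input is an assumption, and the CIC module and decoder are omitted. The point is only the structure: separate weights per modality at the low level, one shared (Siamese) set of weights at the high level, and a different fusion rule at each level.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(in_ch, out_ch, rng):
    """Random (out_ch x in_ch) weight matrix; a stand-in for a real encoder stage."""
    return [[rng.uniform(-1, 1) for _ in range(in_ch)] for _ in range(out_ch)]


def apply_layer(weights, feat):
    """1x1-conv-like projection followed by ReLU.

    feat is a list of channels, each a flat list of pixel values.
    """
    n_pix = len(feat[0])
    return [
        [max(0.0, sum(w * ch[p] for w, ch in zip(row, feat))) for p in range(n_pix)]
        for row in weights
    ]


def spatial_fuse(a, b):
    """Hypothetical spatial gating (NOT the paper's CSA): each modality is
    reweighted per pixel by a map computed from the other modality."""
    n_ch, n_pix = len(a), len(a[0])
    map_a = [sigmoid(sum(ch[p] for ch in a) / n_ch) for p in range(n_pix)]
    map_b = [sigmoid(sum(ch[p] for ch in b) / n_ch) for p in range(n_pix)]
    return [
        [a[c][p] * map_b[p] + b[c][p] * map_a[p] for p in range(n_pix)]
        for c in range(n_ch)
    ]


def channel_fuse(a, b):
    """Hypothetical channel gating (NOT the paper's CCA): each modality is
    reweighted per channel by a weight computed from the other modality."""
    n_pix = len(a[0])
    w_a = [sigmoid(sum(ch) / n_pix) for ch in a]
    w_b = [sigmoid(sum(ch) / n_pix) for ch in b]
    return [
        [a[c][p] * w_b[c] + b[c][p] * w_a[c] for p in range(n_pix)]
        for c in range(len(a))
    ]


rng = random.Random(0)
low_rgb = make_layer(3, 4, rng)      # two-stream: RGB has its own low-level weights
low_tir = make_layer(1, 4, rng)      # two-stream: TIR has its own low-level weights
high_shared = make_layer(4, 8, rng)  # Siamese: one set of high-level weights for both


def encode(rgb, tir):
    # Low level: modality-specific extraction, then spatial fusion.
    f_rgb_low = apply_layer(low_rgb, rgb)
    f_tir_low = apply_layer(low_tir, tir)
    fused_low = spatial_fuse(f_rgb_low, f_tir_low)
    # High level: the same weights process both modalities, then channel fusion.
    f_rgb_high = apply_layer(high_shared, f_rgb_low)
    f_tir_high = apply_layer(high_shared, f_tir_low)
    fused_high = channel_fuse(f_rgb_high, f_tir_high)
    return fused_low, fused_high


n_pix = 6  # a flattened toy "image" of six pixels
rgb = [[rng.random() for _ in range(n_pix)] for _ in range(3)]
tir = [[rng.random() for _ in range(n_pix)] for _ in range(1)]
fused_low, fused_high = encode(rgb, tir)
print(len(fused_low), len(fused_low[0]))    # channels, pixels of the low-level fusion
print(len(fused_high), len(fused_high[0]))  # channels, pixels of the high-level fusion
```

Sharing `high_shared` across both modalities is what makes the high-level stage Siamese in this sketch; replacing it with two separate layers would turn the model back into a plain two-stream encoder.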
Pages: 2892-2905
Page count: 14
Related papers
50 in total
  • [1] DaCFN: divide-and-conquer fusion network for RGB-T object detection
    Wang, Bofan
    Zhao, Haitao
    Zhuang, Yi
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (07) : 2407 - 2420
  • [2] DaCFN: divide-and-conquer fusion network for RGB-T object detection
    Bofan Wang
    Haitao Zhao
    Yi Zhuang
    International Journal of Machine Learning and Cybernetics, 2023, 14 : 2407 - 2420
  • [3] Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection
    Tang, Hao
    Li, Zechao
    Zhang, Dong
    He, Shengfeng
    Tang, Jinhui
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (03) : 1958 - 1974
  • [4] A Lightweight RGB-T Fusion Network for Practical Semantic Segmentation
    Zhang, Haoyuan
    Li, Zifeng
    Wu, Zhenyu
    Wang, Danwei
    2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 4233 - 4238
  • [5] Resolving semantic conflicts in RGB-T semantic segmentation
    Zhao, Shenlu
    Jin, Ziniu
    Jiao, Qiang
    Zhang, Qiang
    Han, Jungong
    PATTERN RECOGNITION, 2025, 162
  • [6] AGFNet: Adaptive Gated Fusion Network for RGB-T Semantic Segmentation
    Zhou, Xiaofei
    Wu, Xiaoling
    Bao, Liuxin
    Yin, Haibing
    Jiang, Qiuping
    Zhang, Jiyong
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2025,
  • [7] Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic Segmentation
    Xu, Xiaogang
    Zhao, Hengshuang
    Jia, Jiaya
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 7466 - 7475
  • [8] MiLNet: Multiplex Interactive Learning Network for RGB-T Semantic Segmentation
    Liu, Jinfu
    Liu, Hong
    Li, Xia
    Ren, Jiale
    Xu, Xinhua
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 1686 - 1699
  • [9] Context-Aware Interaction Network for RGB-T Semantic Segmentation
    Lv, Ying
    Liu, Zhi
    Li, Gongyang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6348 - 6360
  • [10] Complementarity-aware cross-modal feature fusion network for RGB-T semantic segmentation
    Wu, Wei
    Chu, Tao
    Liu, Qiong
    PATTERN RECOGNITION, 2022, 131