A Feature Divide-and-Conquer Network for RGB-T Semantic Segmentation

Cited by: 22
|
Authors
Zhao, Shenlu [1 ,2 ]
Zhang, Qiang [1 ,2 ]
Affiliations
[1] Xidian Univ, Key Lab Elect Equipment Struct Design, Minist Educ, Xian 710071, Shaanxi, Peoples R China
[2] Xidian Univ, Ctr Complex Syst, Sch Mechanoelect Engn, Xian 710071, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Semantic segmentation; Data mining; Semantics; Lighting; Decoding; Thermal sensors; RGB-T semantic segmentation; feature divide-and-conquer strategy; multi-scale contextual information;
D O I
10.1109/TCSVT.2022.3229359
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic and Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
Similar to other multi-modal pixel-level prediction tasks, existing RGB-T semantic segmentation methods usually employ a two-stream structure to extract RGB and thermal infrared (TIR) features, respectively, and adopt the same fusion strategies to integrate different levels of unimodal features. This results in inadequate extraction of unimodal features and insufficient exploitation of cross-modal information from the paired RGB and TIR images. Alternatively, in this paper, we present a novel RGB-T semantic segmentation model, FDCNet, in which a feature divide-and-conquer strategy performs unimodal feature extraction and cross-modal feature fusion in one go. Concretely, we first employ a two-stream structure to extract unimodal low-level features, followed by a Siamese structure to extract unimodal high-level features from the paired RGB and TIR images. This concise but efficient structure takes into account both the modality discrepancies of low-level features and the underlying semantic consistency of high-level features across the paired RGB and TIR images. Furthermore, considering the characteristics of different layers of features, a Cross-modal Spatial Activation (CSA) module and a Cross-modal Channel Activation (CCA) module are presented for the fusion of low-level RGB and TIR features and for the fusion of high-level RGB and TIR features, respectively, thus facilitating the capture of cross-modal information. On top of that, with an embedded Cross-scale Interaction Context (CIC) module for mining multi-scale contextual information, the proposed FDCNet achieves new state-of-the-art results on the MFNet and PST900 datasets.
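The divide-and-conquer pipeline the abstract describes can be sketched in a toy form: modality-specific ("two-stream") low-level extraction, shared-weight ("Siamese") high-level extraction, then a spatially gated fusion for low-level features (loosely CSA-like) and a channel-gated fusion for high-level features (loosely CCA-like). This is a minimal NumPy illustration of the structural idea only, not the paper's implementation; the linear `conv_like` stand-in, layer widths, and gate formulas are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_like(x, w):
    """Stand-in for a convolutional block: channel projection + ReLU."""
    return np.maximum(x @ w, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Separate low-level weights per modality (two-stream stage).
w_low_rgb = rng.standard_normal((3, 16)) * 0.1   # 3 RGB channels -> 16
w_low_tir = rng.standard_normal((1, 16)) * 0.1   # 1 thermal channel -> 16
# One shared weight matrix applied to both modalities (Siamese stage).
w_high = rng.standard_normal((16, 32)) * 0.1

def extract(rgb, tir):
    f_rgb_low = conv_like(rgb, w_low_rgb)       # modality-specific
    f_tir_low = conv_like(tir, w_low_tir)       # modality-specific
    f_rgb_high = conv_like(f_rgb_low, w_high)   # shared weights
    f_tir_high = conv_like(f_tir_low, w_high)   # shared weights
    return (f_rgb_low, f_tir_low), (f_rgb_high, f_tir_high)

def spatial_gate_fusion(a, b):
    """CSA-like analogy: one gate per spatial position (per row here)."""
    g = sigmoid((a + b).mean(axis=1, keepdims=True))
    return g * a + (1.0 - g) * b

def channel_gate_fusion(a, b):
    """CCA-like analogy: one gate per channel (per column here)."""
    g = sigmoid((a + b).mean(axis=0, keepdims=True))
    return g * a + (1.0 - g) * b

# Toy input: 8 "pixels" with 3 RGB channels and 1 thermal channel.
rgb = rng.standard_normal((8, 3))
tir = rng.standard_normal((8, 1))
(low_r, low_t), (high_r, high_t) = extract(rgb, tir)
fused_low = spatial_gate_fusion(low_r, low_t)
fused_high = channel_gate_fusion(high_r, high_t)
print(fused_low.shape, fused_high.shape)  # (8, 16) (8, 32)
```

The point of the split is visible in the weights: the low-level stage keeps two independent parameter sets (respecting modality discrepancies), while the high-level stage reuses a single `w_high` for both inputs (exploiting their shared semantics).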
Pages: 2892-2905
Number of pages: 14
Related Papers
50 records total
  • [31] Few-Shot Segmentation via Divide-and-Conquer Proxies
    Lang, Chunbo
    Cheng, Gong
    Tu, Binfei
    Han, Junwei
    International Journal of Computer Vision, 2024, 132: 261-283
  • [32] New-Zealand S and T: Divide-and-Conquer
    Geddes, R.
    SEARCH, 1993, 24(09): 252
  • [33] Multiobjective Guided Divide-and-Conquer Network for Hyperspectral Pansharpening
    Wu, Xiande
    Feng, Jie
    Shang, Ronghua
    Zhang, Xiangrong
    Jiao, Licheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [35] ABMDRNet: Adaptive-weighted Bi-directional Modality Difference Reduction Network for RGB-T Semantic Segmentation
    Zhang, Qiang
    Zhao, Shenlu
    Luo, Yongjiang
    Zhang, Dingwen
    Huang, Nianchang
    Han, Jungong
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 2633-2642
  • [36] BMDENet: Bi-Directional Modality Difference Elimination Network for Few-Shot RGB-T Semantic Segmentation
    Zhao, Ying
    Song, Kechen
    Zhang, Yiming
    Yan, Yunhui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2023, 70(11): 4266-4270
  • [37] FADSiamNet: feature affinity drift siamese network for RGB-T target tracking
    Li, Haiyan
    Cao, Yonghui
    Guo, Lei
    Chen, Quan
    Ding, Zhaisheng
    Xie, Shidong
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024: 2779-2799
  • [38] Modeling of Failure Prediction Bayesian Network with Divide-and-Conquer Principle
    Cai, Zhiqiang
    Si, Weitao
    Si, Shubin
    Sun, Shudong
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2014, 2014
  • [39] MFCNet: Multimodal Feature Fusion Network for RGB-T Vehicle Density Estimation
    Qin, Ling-Xiao
    Sun, Hong-Mei
    Duan, Xiao-Meng
    Che, Cheng-Yue
    Jia, Rui-Sheng
    IEEE INTERNET OF THINGS JOURNAL, 2025, 12(04): 4207-4219
  • [40] Cascaded Feature Network for Semantic Segmentation of RGB-D Images
    Lin, Di
    Chen, Guangyong
    Cohen-Or, Daniel
    Heng, Pheng-Ann
    Huang, Hui
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 1320-1328