A Feature Divide-and-Conquer Network for RGB-T Semantic Segmentation

Cited by: 22

Authors:

Zhao, Shenlu [1,2]

Zhang, Qiang [1,2]

Affiliations:

[1] Xidian Univ, Key Lab Elect Equipment Struct Design, Minist Educ, Xian 710071, Shaanxi, Peoples R China

[2] Xidian Univ, Ctr Complex Syst, Sch Mechanoelect Engn, Xian 710071, Shaanxi, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2023 / Vol. 33 / No. 06

Funding:

National Natural Science Foundation of China;

Keywords:

Feature extraction; Semantic segmentation; Data mining; Semantics; Lighting; Decoding; Thermal sensors; RGB-T semantic segmentation; feature divide-and-conquer strategy; multi-scale contextual information;

DOI:

10.1109/TCSVT.2022.3229359

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Similar to other multi-modal pixel-level prediction tasks, existing RGB-T semantic segmentation methods usually employ a two-stream structure to extract RGB and thermal infrared (TIR) features, respectively, and adopt the same fusion strategies to integrate different levels of unimodal features. This results in inadequate extraction of unimodal features and inadequate exploitation of cross-modal information from the paired RGB and TIR images. Alternatively, in this paper, we present a novel RGB-T semantic segmentation model, i.e., FDCNet, where a feature divide-and-conquer strategy performs unimodal feature extraction and cross-modal feature fusion in one go. Concretely, we first employ a two-stream structure to extract unimodal low-level features, followed by a Siamese structure to extract unimodal high-level features from the paired RGB and TIR images. This concise but efficient structure makes it possible to take into account both the modality discrepancies of low-level features and the underlying semantic consistency of high-level features across the paired RGB and TIR images. Furthermore, considering the characteristics of different layers of features, a Cross-modal Spatial Activation (CSA) module and a Cross-modal Channel Activation (CCA) module are presented for the fusion of low-level RGB and TIR features and for the fusion of high-level RGB and TIR features, respectively, thus facilitating the capture of cross-modal information. On top of that, with an embedded Cross-scale Interaction Context (CIC) module for mining multi-scale contextual information, our proposed model (i.e., FDCNet) for RGB-T semantic segmentation achieves new state-of-the-art experimental results on the MFNet and PST900 datasets.
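
The layout described in the abstract (separate low-level encoders per modality, a shared-weight Siamese high-level encoder, spatial gating for low-level fusion and channel gating for high-level fusion) can be illustrated with a toy sketch. The abstract does not give the internals of the CSA or CCA modules, so everything below is an assumption for illustration only: the function names, the sigmoid gates, and the scalar "layers" are hypothetical stand-ins, not the FDCNet implementation. Features are represented as a list of channels, each a flat list of pixel values.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(scale):
    """Hypothetical stand-in for an encoder stage: scale, then ReLU."""
    return lambda feat: [[max(0.0, scale * v) for v in ch] for ch in feat]


def spatial_gate_fuse(rgb, tir):
    """Toy spatial gating: one weight per pixel, shared across channels."""
    channels, pixels = len(rgb), len(rgb[0])
    fused = [[0.0] * pixels for _ in range(channels)]
    for p in range(pixels):
        # Gate computed from the channel-averaged RGB/TIR difference at pixel p.
        diff = sum(rgb[c][p] - tir[c][p] for c in range(channels)) / channels
        w = sigmoid(diff)
        for c in range(channels):
            fused[c][p] = w * rgb[c][p] + (1.0 - w) * tir[c][p]
    return fused


def channel_gate_fuse(rgb, tir):
    """Toy channel gating: one weight per channel, shared across pixels."""
    fused = []
    for r_ch, t_ch in zip(rgb, tir):
        # Gate computed from the pixel-averaged RGB/TIR difference in this channel.
        w = sigmoid((sum(r_ch) - sum(t_ch)) / len(r_ch))
        fused.append([w * r + (1.0 - w) * t for r, t in zip(r_ch, t_ch)])
    return fused


# Two-stream low-level stage: each modality has its own parameters.
low_rgb = make_layer(1.0)
low_tir = make_layer(0.5)
# Siamese high-level stage: one set of parameters applied to both modalities.
high_shared = make_layer(2.0)


def forward(rgb, tir):
    r_low, t_low = low_rgb(rgb), low_tir(tir)
    low_fused = spatial_gate_fuse(r_low, t_low)      # low-level fusion
    r_high, t_high = high_shared(r_low), high_shared(t_low)
    high_fused = channel_gate_fuse(r_high, t_high)   # high-level fusion
    return low_fused, high_fused


low, high = forward([[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [2.0, 1.0]])
```

The only point of the sketch is the division of labor: modality-specific weights where the two inputs differ most, shared weights where their semantics are expected to agree, and a different gating axis (pixels versus channels) at each level.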

Pages: 2892-2905

Page count: 14

50 items in total

[1] DaCFN: divide-and-conquer fusion network for RGB-T object detection
Wang, Bofan
Zhao, Haitao
Zhuang, Yi
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (07) : 2407 - 2420
[2] DaCFN: divide-and-conquer fusion network for RGB-T object detection
Bofan Wang
Haitao Zhao
Yi Zhuang
International Journal of Machine Learning and Cybernetics, 2023, 14 : 2407 - 2420
[3] Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection
Tang, Hao
Li, Zechao
Zhang, Dong
He, Shengfeng
Tang, Jinhui
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (03) : 1958 - 1974
[4] A Lightweight RGB-T Fusion Network for Practical Semantic Segmentation
Zhang, Haoyuan
Li, Zifeng
Wu, Zhenyu
Wang, Danwei
2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 4233 - 4238
[5] Resolving semantic conflicts in RGB-T semantic segmentation
Zhao, Shenlu
Jin, Ziniu
Jiao, Qiang
Zhang, Qiang
Han, Jungong
PATTERN RECOGNITION, 2025, 162
[6] AGFNet: Adaptive Gated Fusion Network for RGB-T Semantic Segmentation
Zhou, Xiaofei
Wu, Xiaoling
Bao, Liuxin
Yin, Haibing
Jiang, Qiuping
Zhang, Jiyong
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2025,
[7] Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic Segmentation
Xu, Xiaogang
Zhao, Hengshuang
Jia, Jiaya
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 7466 - 7475
[8] MiLNet: Multiplex Interactive Learning Network for RGB-T Semantic Segmentation
Liu, Jinfu
Liu, Hong
Li, Xia
Ren, Jiale
Xu, Xinhua
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 1686 - 1699
[9] Context-Aware Interaction Network for RGB-T Semantic Segmentation
Lv, Ying
Liu, Zhi
Li, Gongyang
IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6348 - 6360
[10] Complementarity-aware cross-modal feature fusion network for RGB-T semantic segmentation
Wu, Wei
Chu, Tao
Liu, Qiong
PATTERN RECOGNITION, 2022, 131
