Mitigating Modality Discrepancies for RGB-T Semantic Segmentation

Cited by: 23
|
Authors
Zhao, Shenlu [1 ,2 ]
Liu, Yichen [1 ,2 ]
Jiao, Qiang [1 ,2 ]
Zhang, Qiang [1 ,2 ]
Han, Jungong [3 ]
Affiliations
[1] Xidian Univ, Key Lab Elect Equipment Struct Design, Minist Educ, Xian 710071, Shaanxi, Peoples R China
[2] Xidian Univ, Ctr Complex Syst, Sch Mechanoelect Engn, Xian 710071, Shaanxi, Peoples R China
[3] Aberystwyth Univ, Comp Sci Dept, Aberystwyth SY23 3FL, England
Funding
National Natural Science Foundation of China;
Keywords
Bridging-then-fusing; contextual information; dataset; modality discrepancy reduction; RGB-T semantic segmentation; NETWORK; CNN;
DOI
10.1109/TNNLS.2022.3233089
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Semantic segmentation models gain robustness against adverse illumination conditions by taking advantage of complementary information from visible and thermal infrared (RGB-T) images. Despite its importance, most existing RGB-T semantic segmentation models directly adopt primitive fusion strategies, such as elementwise summation, to integrate multimodal features. Such strategies, unfortunately, overlook the modality discrepancies caused by inconsistent unimodal features obtained by two independent feature extractors, thus hindering the exploitation of cross-modal complementary information within the multimodal data. To address this, we propose a novel network for RGB-T semantic segmentation, i.e., MDRNet+, an improved version of our previous work ABMDRNet. The core of MDRNet+ is a new strategy, termed bridging-then-fusing, which mitigates modality discrepancies before cross-modal feature fusion. Concretely, an improved modality discrepancy reduction (MDR+) subnetwork is designed, which first extracts unimodal features and then reduces their modality discrepancies. Afterward, discriminative multimodal features for RGB-T semantic segmentation are adaptively selected and integrated via several channel-weighted fusion (CWF) modules. Furthermore, a multiscale spatial context (MSC) module and a multiscale channel context (MCC) module are presented to effectively capture contextual information. Finally, we assemble a challenging RGB-T semantic segmentation dataset, i.e., RTSS, for urban scene understanding, mitigating the lack of well-annotated training data. Comprehensive experiments demonstrate that our proposed model remarkably surpasses other state-of-the-art models on the MFNet, PST900, and RTSS datasets.
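To give a concrete picture of the bridging-then-fusing idea summarized in the abstract, the following PyTorch sketch shows one plausible reading of it: each unimodal feature map passes through a lightweight "bridge" before a channel-weighted fusion step combines the two modalities. The module names, layer choices, and the squeeze-and-excitation-style gating are illustrative assumptions, not the paper's actual MDR+ or CWF implementations.

```python
import torch
import torch.nn as nn


class ChannelWeightedFusion(nn.Module):
    """Illustrative channel-weighted fusion: per-channel weights decide
    how much each modality contributes to the fused feature."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Squeeze-and-excitation-style gate over the concatenated modalities.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feat_rgb: torch.Tensor, feat_thermal: torch.Tensor) -> torch.Tensor:
        # w lies in (0, 1) per channel; fuse as a convex combination.
        w = self.gate(torch.cat([feat_rgb, feat_thermal], dim=1))
        return w * feat_rgb + (1.0 - w) * feat_thermal


class BridgeThenFuse(nn.Module):
    """Bridging-then-fusing, illustrated: narrow the modality gap with a
    small projection per modality (a stand-in for the MDR+ subnetwork)
    before applying channel-weighted fusion."""

    def __init__(self, channels: int):
        super().__init__()
        self.bridge_rgb = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.bridge_thermal = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.fuse = ChannelWeightedFusion(channels)

    def forward(self, feat_rgb: torch.Tensor, feat_thermal: torch.Tensor) -> torch.Tensor:
        return self.fuse(self.bridge_rgb(feat_rgb), self.bridge_thermal(feat_thermal))


if __name__ == "__main__":
    rgb = torch.randn(2, 64, 60, 80)      # hypothetical RGB backbone features
    thermal = torch.randn(2, 64, 60, 80)  # hypothetical thermal backbone features
    fused = BridgeThenFuse(64)(rgb, thermal)
    print(fused.shape)  # torch.Size([2, 64, 60, 80])
```

The key design point the sketch tries to capture is the ordering: the unimodal features are first mapped toward a shared representation, and only then does an adaptive, channel-wise gate select the complementary information from each modality, rather than summing raw features directly.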
Pages: 9380-9394
Number of pages: 15
Related Papers
50 records in total
  • [1] Resolving semantic conflicts in RGB-T semantic segmentation
    Zhao, Shenlu
    Jin, Ziniu
    Jiao, Qiang
    Zhang, Qiang
    Han, Jungong
    PATTERN RECOGNITION, 2025, 162
  • [2] Mask-guided modality difference reduction network for RGB-T semantic segmentation
    Liang, Wenli
    Yang, Yuanjian
    Li, Fangyu
    Long, Xi
    Shan, Caifeng
    NEUROCOMPUTING, 2023, 523 : 9 - 17
  • [3] RGB-T Semantic Segmentation With Location, Activation, and Sharpening
    Li, Gongyang
    Wang, Yike
    Liu, Zhi
    Zhang, Xinpeng
    Zeng, Dan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (03) : 1223 - 1235
  • [4] A Lightweight RGB-T Fusion Network for Practical Semantic Segmentation
    Zhang, Haoyuan
    Li, Zifeng
    Wu, Zhenyu
    Wang, Danwei
    2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC, 2023, : 4233 - 4238
  • [5] AGFNet: Adaptive Gated Fusion Network for RGB-T Semantic Segmentation
    Zhou, Xiaofei
    Wu, Xiaoling
    Bao, Liuxin
    Yin, Haibing
    Jiang, Qiuping
    Zhang, Jiyong
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2025,
  • [6] MiLNet: Multiplex Interactive Learning Network for RGB-T Semantic Segmentation
    Liu, Jinfu
    Liu, Hong
    Li, Xia
    Ren, Jiale
    Xu, Xinhua
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 1686 - 1699
  • [7] Context-Aware Interaction Network for RGB-T Semantic Segmentation
    Lv, Ying
    Liu, Zhi
    Li, Gongyang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6348 - 6360
  • [8] A Feature Divide-and-Conquer Network for RGB-T Semantic Segmentation
    Zhao, Shenlu
    Zhang, Qiang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (06) : 2892 - 2905
  • [9] CIGF-Net: Cross-Modality Interaction and Global-Feature Fusion for RGB-T Semantic Segmentation
    Zhang, Zhiwei
    Liu, Yisha
    Xue, Weimin
    Zhuang, Yan
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024,
  • [10] ABMDRNet: Adaptive-weighted Bi-directional Modality Difference Reduction Network for RGB-T Semantic Segmentation
    Zhang, Qiang
    Zhao, Shenlu
    Luo, Yongjiang
    Zhang, Dingwen
    Huang, Nianchang
    Han, Jungong
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 2633 - 2642