Cross-level interaction fusion network-based RGB-T semantic segmentation for distant targets

Cited by: 0
Authors
Chen, Yu [1]
Li, Xiang [1]
Luan, Chao [2]
Hou, Weimin [2]
Liu, Haochen [2]
Zhu, Zihui [3]
Xue, Lian [3]
Zhang, Jianqi [1]
Liu, Delian [1]
Wu, Xin [1]
Wei, Linfang [1]
Jian, Chaochao [1]
Li, Jinze [1]
Affiliations
[1] Xidian Univ, Sch Optoelect Engn, Xian 710071, Peoples R China
[2] Beijing Inst Control & Elect Technol, Beijing 100038, Peoples R China
[3] Natl Key Lab Sci & Technol Test Phys & Numer Math, Beijing 100076, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Semantic segmentation; Feature fusion; Cross modality; Multi-scale information; Distant object
DOI
10.1016/j.patcog.2024.111218
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-T segmentation represents an innovative approach driven by advancements in multispectral detection and is poised to replace traditional RGB segmentation methods. An effective cross-modality feature fusion module is essential for this technology. The precise segmentation of distant objects is another significant challenge. Focused on these two areas, we propose an end-to-end distant object feature fusion network (DOFFNet) for RGB-T segmentation. Initially, we introduce a cross-level interaction fusion strategy (CLIF) and an inter-correlation fusion method (IFFM) in the encoder to enhance multi-scale feature expression and improve fusion accuracy. Subsequently, we propose a residual dense pixel convolution (R-DPC) in the decoder with a trainable upsampling unit that dynamically reconstructs information lost during encoding, particularly for distant objects whose features may vanish after pooling. Experimental results show that our DOFFNet achieves a top mean pixel accuracy of 75.8% and dramatically improves accuracy for four classes, including objects occupying as little as 0.2%-2% of total pixels. This improvement ensures more reliable and effective performance in practical applications, particularly in scenarios where small object detection is critical. Moreover, it demonstrates potential applicability in other fields like medical imaging and remote sensing.
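The abstract reports mean pixel accuracy and stresses classes that cover only 0.2%-2% of pixels. This record does not define the metric, so the following is a minimal sketch assuming the common definition (per-class pixel accuracy averaged over the classes present in the ground truth). The function name and the toy label maps are hypothetical and are not taken from the paper.

```python
def mean_pixel_accuracy(pred, target, num_classes):
    """Average the per-class pixel accuracies (assumed standard definition).

    pred and target are flat sequences of integer class labels.
    Classes that never occur in target are skipped.
    """
    correct = [0] * num_classes
    total = [0] * num_classes
    for p, t in zip(pred, target):
        total[t] += 1
        if p == t:
            correct[t] += 1
    per_class = [c / n for c, n in zip(correct, total) if n > 0]
    return sum(per_class) / len(per_class)


# Hypothetical toy example: class 2 is a "small" class with a single pixel.
target = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

overall = sum(p == t for p, t in zip(pred, target)) / len(target)
macc = mean_pixel_accuracy(pred, target, num_classes=3)
print(f"overall pixel accuracy: {overall:.3f}")
print(f"mean pixel accuracy:    {macc:.3f}")
```

Under this definition every class carries equal weight regardless of its pixel count, so missing a class that occupies very few pixels lowers the mean noticeably even when overall pixel accuracy stays high.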
Pages: 13