Cross-modal domain generalization semantic segmentation based on fusion features

Times cited: 0
Authors
Yue, Wanlin [1]
Zhou, Zhiheng [1]
Cao, Yinglie [2]
Liuman [3]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510640, Peoples R China
[2] Guangzhou City Univ Technol, Sch Elect & Informat Engn & Commun Engn, Guangzhou 510850, Peoples R China
[3] Wise Secur Technol Guangzhou Co Ltd, Guangzhou 510663, Peoples R China
Keywords
Domain generalization; Semantic segmentation; Cross-modal
DOI
10.1016/j.knosys.2024.112356
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The primary techniques for domain generalization in semantic segmentation are domain randomization and feature whitening. Cross-modal methods are less commonly used but have proven effective. This paper improves cross-modal feature alignment by redesigning the feature alignment module so that alignment across modalities is anchored on fusion features derived from both visual and textual inputs. These fusion features provide a more effective anchor point for alignment, enhancing the transfer of semantic information from the textual to the visual domain. Furthermore, the decoder plays a crucial role, because its ability to categorize features directly determines the segmentation performance of the entire model. To strengthen the decoder, this study feeds the fusion features to it as input, with image labels providing the supervision. Experimental results indicate that the approach significantly enhances the model's generalization capabilities.
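The two ideas in the abstract (aligning visual features to a fusion anchor, and supervising the decoder on fusion features) can be illustrated with a small sketch. This is not the authors' implementation: the record gives no architectural details, so the fusion rule (a normalized weighted sum), the cosine alignment loss, the linear decoder, and every function and parameter name below are assumptions chosen only for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def fuse(visual, text, alpha=0.5):
    """Hypothetical fusion rule: weighted sum of the normalized visual
    feature and the text embedding of its class, renormalized."""
    v, t = l2_normalize(visual), l2_normalize(text)
    return l2_normalize([alpha * x + (1 - alpha) * y for x, y in zip(v, t)])

def alignment_loss(visual_feats, labels, text_embeds, alpha=0.5):
    """Assumed alignment objective: mean (1 - cosine) between each
    visual feature and the fusion anchor built for its labeled class."""
    total = 0.0
    for v, y in zip(visual_feats, labels):
        anchor = fuse(v, text_embeds[y], alpha)
        total += 1.0 - cosine(v, anchor)
    return total / len(visual_feats)

def decoder_loss(fusion_feats, labels, weights):
    """Cross-entropy of an assumed linear decoder applied to fusion
    features, supervised by the image labels."""
    total = 0.0
    for f, y in zip(fusion_feats, labels):
        logits = [sum(w * x for w, x in zip(row, f)) for row in weights]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[y]
    return total / len(fusion_feats)

# Toy data: two classes, 2-D features, one "pixel" per class.
text_embeds = [[1.0, 0.0], [0.0, 1.0]]
visual_feats = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 1]

fusion_feats = [fuse(v, text_embeds[y]) for v, y in zip(visual_feats, labels)]
align = alignment_loss(visual_feats, labels, text_embeds)
ce = decoder_loss(fusion_feats, labels, weights=text_embeds)
print(f"alignment loss: {align:.4f}, decoder loss: {ce:.4f}")
```

In this toy setup the alignment term shrinks as a visual feature moves toward its class text embedding, and the decoder term is an ordinary label-supervised classification loss; a real model would compute both per pixel on learned feature maps.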
Pages: 10