Cross-modal domain generalization semantic segmentation based on fusion features

Cited: 0
Authors
Yue, Wanlin [1 ]
Zhou, Zhiheng [1 ]
Cao, Yinglie [2 ]
Liuman [3 ]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510640, Peoples R China
[2] Guangzhou City Univ Technol, Sch Elect & Informat Engn & Commun Engn, Guangzhou 510850, Peoples R China
[3] Wise Secur Technol Guangzhou Co Ltd, Guangzhou 510663, Peoples R China
Keywords
Domain generalization; Semantic segmentation; Cross-modal
DOI
10.1016/j.knosys.2024.112356
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The primary techniques for domain generalization in semantic segmentation revolve around domain randomization and feature whitening. Although less commonly employed, cross-modal methods have also demonstrated effective results. This paper enhances cross-modal feature alignment by redesigning the feature alignment module: alignment across modalities is performed with fusion features derived from both visual and textual inputs. These fusion features provide a more effective anchor for alignment, improving the transfer of semantic information from the textual to the visual domain. Furthermore, the decoder plays a crucial role, since its ability to classify features directly determines the segmentation performance of the whole model. To strengthen the decoder, this study feeds the fusion features into the decoder and supervises it with the image labels. Experimental results indicate that our approach significantly improves the model's generalization capability.
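Below is a minimal PyTorch-style sketch of the mechanism described in the abstract: visual and textual features are projected into a shared space and fused, the fusion features serve as the anchor for cross-modal alignment, and the same fusion features are fed to the decoder, which is supervised with the segmentation labels. All module and variable names (FusionAlignSeg, visual_proj, text_proj, etc.), the concatenation-based fusion, and the cosine alignment loss are illustrative assumptions, not the authors' implementation.

# Sketch only: illustrates fusion-feature alignment plus a label-supervised decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionAlignSeg(nn.Module):
    def __init__(self, visual_dim=512, text_dim=512, fusion_dim=512, num_classes=19):
        super().__init__()
        # Project both modalities into a shared space before fusion.
        self.visual_proj = nn.Conv2d(visual_dim, fusion_dim, kernel_size=1)
        self.text_proj = nn.Linear(text_dim, fusion_dim)
        # Simple fusion: concatenate and mix with a 1x1 conv (one of many possible choices).
        self.fuse = nn.Conv2d(fusion_dim * 2, fusion_dim, kernel_size=1)
        # Decoder head that classifies the fusion features per pixel.
        self.decoder = nn.Sequential(
            nn.Conv2d(fusion_dim, fusion_dim, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(fusion_dim, num_classes, 1),
        )

    def forward(self, visual_feats, text_feats):
        # visual_feats: (B, C_v, H, W) from an image encoder
        # text_feats:   (B, C_t), e.g. pooled prompt/class embeddings from a text encoder
        v = self.visual_proj(visual_feats)                  # (B, D, H, W)
        t = self.text_proj(text_feats)[:, :, None, None]    # (B, D, 1, 1)
        t = t.expand(-1, -1, v.size(2), v.size(3))          # broadcast text over the spatial grid
        fusion = self.fuse(torch.cat([v, t], dim=1))        # (B, D, H, W) fusion features
        logits = self.decoder(fusion)                       # per-pixel class scores
        return fusion, v, logits

def losses(fusion, visual, logits, labels, align_weight=1.0):
    # Alignment term: pull visual features toward the fusion-feature anchor
    # (cosine distance is used here purely for illustration).
    align = 1.0 - F.cosine_similarity(visual, fusion.detach(), dim=1).mean()
    # Decoder supervision: standard cross-entropy against the segmentation labels.
    seg = F.cross_entropy(
        F.interpolate(logits, size=labels.shape[-2:], mode="bilinear", align_corners=False),
        labels, ignore_index=255)
    return seg + align_weight * align

The point the sketch tries to capture is that the alignment target and the decoder input are the same fusion features, so the textual semantics that guide alignment are also the features the decoder learns to classify.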
Pages: 10