Cross-modal domain generalization semantic segmentation based on fusion features

Times cited: 0
Authors
Yue, Wanlin [1]
Zhou, Zhiheng [1]
Cao, Yinglie [2]
Liuman [3]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510640, Peoples R China
[2] Guangzhou City Univ Technol, Sch Elect & Informat Engn & Commun Engn, Guangzhou 510850, Peoples R China
[3] Wise Secur Technol Guangzhou Co Ltd, Guangzhou 510663, Peoples R China
Keywords
Domain generalization; Semantic segmentation; Cross-modal
DOI
10.1016/j.knosys.2024.112356
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The primary techniques for domain generalization in semantic segmentation revolve around domain randomization and feature whitening. Although less commonly employed, methods based on cross-modality have demonstrated effective outcomes. This paper introduces enhancements to cross-modal feature alignment by redesigning the feature alignment module. This redesign facilitates alignment across different modalities by leveraging fusion features derived from both visual and textual inputs. These fusion features provide a more effective anchor point for alignment, enhancing the transfer of semantic information from textual to visual domains. Furthermore, the decoder plays a crucial role in the model as its ability to categorize features directly impacts the segmentation performance of the entire model. To enhance the decoder's capability, this study employs the fusion features as the input for the decoder, with image labels providing the supervision. Experimental results indicate that our approach significantly enhances the model's generalization capabilities.
Pages: 10