Adversarial Cross-modal Domain Adaptation for Multi-modal Semantic Segmentation in Autonomous Driving

Cited by: 0
Authors
Shi, Mengqi [1 ]
Cao, Haozhi [1 ]
Xie, Lihua [1 ]
Yang, Jianfei [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore
DOI: 10.1109/ICARCV57592.2022.10004265
Chinese Library Classification (CLC): TP [Automation Technology, Computer Technology]
Discipline Code: 0812
Abstract
3D semantic segmentation is a vital problem in autonomous driving. Vehicles rely on semantic segmentation to sense the surrounding environment and identify pedestrians, roads, and other vehicles. Although many datasets are publicly available, a gap exists between public data and real-world scenarios due to differing weather conditions and environments, which is formulated as the domain shift. Research on Unsupervised Domain Adaptation (UDA) has recently grown to address domain shift and the scarcity of annotated datasets. This paper introduces adversarial learning and cross-modal networks (2D and 3D) to boost UDA performance for semantic segmentation across different datasets. To this end, we design an adversarial training scheme with a domain discriminator that renders the learned features domain-invariant. Furthermore, we demonstrate that introducing the 2D modality can improve the 3D modality under our method. Experimental results show that the proposed approach improves mIoU by 7.53% over the baseline and by 3.68% in multi-modal performance.
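The abstract describes an adversarial training scheme in which a domain discriminator pushes the segmentation network toward domain-invariant features. As a rough illustration only, the sketch below shows one common way such a scheme is realized in PyTorch, via a gradient reversal layer; the module names, feature dimensions, and toy data are assumptions for illustration, not the authors' published architecture.

```python
# Minimal sketch of adversarial domain-invariant feature learning with a
# domain discriminator. All names and sizes here are illustrative assumptions.
import torch
import torch.nn as nn
from torch.autograd import Function


class GradReverse(Function):
    """Identity in the forward pass; negates (and scales) gradients backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient for x; no gradient for the scalar lambd.
        return -ctx.lambd * grad_output, None


class DomainDiscriminator(nn.Module):
    """Predicts source (0) vs. target (1) domain from segmentation features."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 1),
        )

    def forward(self, feats, lambd: float = 1.0):
        # Gradient reversal makes the upstream feature extractor try to
        # *fool* the discriminator, encouraging domain-invariant features.
        return self.net(GradReverse.apply(feats, lambd))


if __name__ == "__main__":
    # Toy usage: per-point features (e.g., fused 2D+3D) from both domains.
    disc = DomainDiscriminator(feat_dim=64)
    bce = nn.BCEWithLogitsLoss()
    src_feats = torch.randn(1024, 64, requires_grad=True)  # source batch
    tgt_feats = torch.randn(1024, 64, requires_grad=True)  # target batch
    logits = torch.cat([disc(src_feats), disc(tgt_feats)])
    labels = torch.cat([torch.zeros(1024, 1), torch.ones(1024, 1)])
    loss = bce(logits, labels)
    loss.backward()  # reversed gradients flow into the feature extractors
```

In a full pipeline this adversarial loss would be added, with a weighting factor, to the supervised segmentation loss on the labeled source domain.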
Pages: 850 - 855
Number of pages: 6
Related Papers
50 records in total
  • [31] Multi-modal Experts Network for Autonomous Driving
    Fang, Shihong
    Choromanska, Anna
    2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2020, : 6439 - 6445
  • [32] Complementarity is the king: Multi-modal and multi-grained hierarchical semantic enhancement network for cross-modal retrieval
    Pei, Xinlei
    Liu, Zheng
    Gao, Shanshan
    Su, Yijun
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 216
  • [33] Semantic Guidance Fusion Network for Cross-Modal Semantic Segmentation
    Zhang, Pan
    Chen, Ming
    Gao, Meng
    SENSORS, 2024, 24 (08)
  • [34] Cross-modal semantic transfer for point cloud semantic segmentation
    Cao, Zhen
    Mi, Xiaoxin
    Qiu, Bo
    Cao, Zhipeng
    Long, Chen
    Yan, Xinrui
    Zheng, Chao
    Dong, Zhen
    Yang, Bisheng
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2025, 221 : 265 - 279
  • [35] Shared Cross-Modal Trajectory Prediction for Autonomous Driving
    Choi, Chiho
    Choi, Joon Hee
    Li, Jiachen
    Malla, Srikanth
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 244 - 253
  • [36] CMIT-Net: a cross-modal information transfer network for multi-modal brain tumor segmentation
    Xu, Shoukun
    Tang, Rui
    Chen, Jialu
    Yuan, Baohua
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (03)
  • [37] CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations
    Zolfaghari, Mohammadreza
    Zhu, Yi
    Gehler, Peter
    Brox, Thomas
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1430 - 1439
  • [39] Semantic consistent adversarial cross-modal retrieval exploiting semantic similarity
    Ou, Weihua
    Xuan, Ruisheng
    Gou, Jianping
    Zhou, Quan
    Cao, Yongfeng
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (21-22) : 14733 - 14750
  • [40] Cross-modal incongruity aligning and collaborating for multi-modal sarcasm detection
    Wang, Jie
    Yang, Yan
    Jiang, Yongquan
    Ma, Minbo
    Xie, Zhuyang
    Li, Tianrui
    INFORMATION FUSION, 2024, 103