Adversarial Cross-modal Domain Adaptation for Multi-modal Semantic Segmentation in Autonomous Driving

Cited by: 0
Authors
Shi, Mengqi [1 ]
Cao, Haozhi [1 ]
Xie, Lihua [1 ]
Yang, Jianfei [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore
Keywords
DOI
10.1109/ICARCV57592.2022.10004265
CLC number
TP [Automation and computer technology]
Discipline code
0812
Abstract
3D semantic segmentation is a vital problem in autonomous driving. Vehicles rely on semantic segmentation to sense the surrounding environment and identify pedestrians, roads, and other vehicles. Although many datasets are publicly available, a gap exists between public data and real-world scenarios due to differing weather and environments, which is formulated as the domain shift. Recently, research on Unsupervised Domain Adaptation (UDA) has risen to address both the domain shift and the lack of annotated datasets. This paper introduces adversarial learning and cross-modal networks (2D and 3D) to boost the performance of UDA for semantic segmentation across different datasets. To this end, we design an adversarial training scheme with a domain discriminator and enable domain-invariant feature learning. Furthermore, we demonstrate that introducing 2D modalities can improve the 3D modality under our method. Experimental results show that the proposed approach improves the mIoU by 7.53% over the baseline and by 3.68% for the multi-modal performance.
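The core idea in the abstract, adversarial training against a domain discriminator to learn domain-invariant features, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the linear feature extractor, logistic discriminator, learning rates, and toy source/target data below are all illustrative assumptions. The adversarial step performs gradient ascent on the discriminator loss with respect to the feature extractor, which is equivalent in effect to a gradient reversal layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy source/target feature batches; the domain gap is a simple mean shift.
Xs = rng.normal(loc=+1.0, size=(64, 2))          # source-domain features
Xt = rng.normal(loc=-1.0, size=(64, 2))          # target-domain features
X = np.vstack([Xs, Xt])
d = np.concatenate([np.ones(64), np.zeros(64)])  # domain labels (1=src, 0=tgt)

W = np.eye(2)                  # "feature extractor" (linear, for the sketch)
w = np.zeros(2); b = 0.0       # domain discriminator (logistic regression)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def disc_loss(W, w, b):
    """Binary cross-entropy of the domain discriminator on F(x) = x @ W."""
    p = sigmoid(X @ W @ w + b)
    eps = 1e-9
    return -np.mean(d * np.log(p + eps) + (1 - d) * np.log(1 - p + eps))

# Step 1: train the discriminator on frozen features (gradient descent).
for _ in range(200):
    F = X @ W
    p = sigmoid(F @ w + b)
    g = p - d                                    # dL/dlogit for BCE
    w -= 0.1 * (F.T @ g) / len(d)
    b -= 0.1 * g.mean()

loss_before = disc_loss(W, w, b)

# Step 2: adversarial update of the feature extractor -- gradient *ascent*
# on the discriminator loss w.r.t. W, pushing features toward being
# indistinguishable across domains (a gradient-reversal step).
F = X @ W
p = sigmoid(F @ w + b)
g = (p - d)[:, None]                             # (N, 1)
grad_W = X.T @ (g * w[None, :]) / len(d)         # dL/dW
W += 0.5 * grad_W                                # ascend: confuse discriminator

loss_after = disc_loss(W, w, b)
print(loss_after > loss_before)                  # discriminator is more confused
```

In the full method, the two steps alternate: the discriminator is repeatedly retrained to separate domains while the (2D and 3D) segmentation backbones are updated to fool it, so the equilibrium yields domain-invariant features.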
Pages: 850-855
Number of pages: 6
Related Papers
50 items in total
  • [21] Multi-modal semantic image segmentation
    Pemasiri, Akila
    Kien Nguyen
    Sridharan, Sridha
    Fookes, Clinton
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 202
  • [22] Cross-modal & Cross-domain Learning for Unsupervised LiDAR Semantic Segmentation
    Chen, Yiyang
    Zhao, Shanshan
    Ding, Changxing
    Tang, Liyao
    Wang, Chaoyue
    Tao, Dacheng
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 3866 - 3875
  • [23] Cross-modal domain generalization semantic segmentation based on fusion features
    Yue, Wanlin
    Zhou, Zhiheng
    Cao, Yinglie
    Liuman
    KNOWLEDGE-BASED SYSTEMS, 2024, 302
  • [24] MoPA: Multi-Modal Prior Aided Domain Adaptation for 3D Semantic Segmentation
    Cao, Haozhi
    Xu, Yuecong
    Yang, Jianfei
    Yin, Pengyu
    Yuan, Shenghai
    Xie, Lihua
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024), 2024, : 9463 - 9470
  • [25] Cross-modal generative models for multi-modal plastic sorting
    Neo, Edward R. K.
    Low, Jonathan S. C.
    Goodship, Vannessa
    Coles, Stuart R.
    Debattista, Kurt
    JOURNAL OF CLEANER PRODUCTION, 2023, 415
  • [26] Cross-modal dynamic convolution for multi-modal emotion recognition
    Wen, Huanglu
    You, Shaodi
    Fu, Ying
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 78
  • [27] MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving
    Li, Jiale
    Dai, Hang
    Han, Hao
    Ding, Yong
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21694 - 21704
  • [28] Multi-Modal Sarcasm Detection with Interactive In-Modal and Cross-Modal Graphs
    Liang, Bin
    Lou, Chenwei
    Li, Xiang
    Gui, Lin
    Yang, Min
    Xu, Ruifeng
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4707 - 4715
  • [29] Semantic Disentanglement Adversarial Hashing for Cross-Modal Retrieval
    Meng, Min
    Sun, Jiaxuan
    Liu, Jigang
    Yu, Jun
    Wu, Jigang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (03) : 1914 - 1926
  • [30] Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval
    Xu, Xing
    Song, Jingkuan
    Lu, Huimin
    Yang, Yang
    Shen, Fumin
    Huang, Zi
    ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2018, : 46 - 54