MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving

Cited by: 38
Authors
Li, Jiale [1]
Dai, Hang [2]
Han, Hao [3]
Ding, Yong [3]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou, Peoples R China
[2] Univ Glasgow, Sch Comp Sci, Glasgow, Lanark, Scotland
[3] Zhejiang Univ, Sch Micronano Elect, Hangzhou, Peoples R China
Keywords
REPRESENTATION
DOI
10.1109/CVPR52729.2023.02078
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
LiDAR and camera are two modalities available for 3D semantic segmentation in autonomous driving. Popular LiDAR-only methods suffer severely from inferior segmentation of small and distant objects due to insufficient laser points, while robust multi-modal solutions remain under-explored. We investigate three crucial inherent difficulties of the multi-modal setting: modality heterogeneity, limited sensor field-of-view intersection, and multi-modal data augmentation. We propose a multi-modal 3D semantic segmentation model (MSeg3D) with joint intra-modal feature extraction and inter-modal feature fusion to mitigate the modality heterogeneity. The multi-modal fusion in MSeg3D consists of geometry-based feature fusion (GF-Phase), cross-modal feature completion, and semantic-based feature fusion (SF-Phase) on all visible points. The multi-modal data augmentation is reinvigorated by applying asymmetric transformations to the LiDAR point cloud and the multi-camera images individually, which benefits model training with diversified augmentation transformations. MSeg3D achieves state-of-the-art results on the nuScenes, Waymo, and SemanticKITTI datasets. Under malfunctioning multi-camera input and multi-frame point cloud input, MSeg3D still shows robustness and improves on the LiDAR-only baseline. Our code is publicly available at https://github.com/jialeli1/lidarseg3d.
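The "limited sensor field-of-view intersection" named in the abstract can be illustrated with a small example: only LiDAR points that project inside a camera image can receive image features directly. The following is a minimal sketch under stated assumptions, not code from the MSeg3D repository. It assumes a simple pinhole camera, points already expressed in the camera frame, and a hypothetical helper name `project_points`; the intrinsic matrix and the three points are made up for illustration.

```python
import numpy as np

def project_points(points_cam, K, image_hw):
    """Project 3D points (assumed to be in the camera frame) with a pinhole model.

    Hypothetical helper for illustration only.
    Returns pixel coordinates of shape (N, 2) and a boolean mask marking points
    that lie in front of the camera and land inside the image bounds.
    """
    h, w = image_hw
    z = points_cam[:, 2]
    in_front = z > 1e-6
    # Use a dummy depth of 1 for points behind the camera to avoid dividing by
    # zero or negative depths; those points are masked out anyway.
    safe_z = np.where(in_front, z, 1.0)
    uvw = points_cam @ K.T  # row-wise K @ p
    uv = uvw[:, :2] / safe_z[:, None]
    inside = (
        (uv[:, 0] >= 0) & (uv[:, 0] < w) &
        (uv[:, 1] >= 0) & (uv[:, 1] < h)
    )
    return uv, in_front & inside

# Made-up intrinsics: focal length 100 px, principal point (50, 50), 100x100 image.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])

points = np.array([
    [0.0, 0.0, 10.0],    # straight ahead of the camera
    [20.0, 0.0, 10.0],   # far to the side, projects outside the image
    [0.0, 0.0, -5.0],    # behind the camera
])

uv, visible = project_points(points, K, (100, 100))
print(visible)  # mask of points inside the camera field of view
```

Points for which the mask is false have no pixel to sample from, which is the situation the abstract's cross-modal feature completion step is described as addressing.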
Pages: 21694-21704
Number of pages: 11