MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving

Cited by: 38
Authors
Li, Jiale [1]
Dai, Hang [2]
Han, Hao [3]
Ding, Yong [3]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou, Peoples R China
[2] Univ Glasgow, Sch Comp Sci, Glasgow, Lanark, Scotland
[3] Zhejiang Univ, Sch Micronano Elect, Hangzhou, Peoples R China
Keywords
REPRESENTATION;
DOI
10.1109/CVPR52729.2023.02078
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
LiDAR and camera are two modalities available for 3D semantic segmentation in autonomous driving. The popular LiDAR-only methods severely suffer from inferior segmentation on small and distant objects due to insufficient laser points, while the robust multi-modal solution is under-explored, where we investigate three crucial inherent difficulties: modality heterogeneity, limited sensor field of view intersection, and multi-modal data augmentation. We propose a multi-modal 3D semantic segmentation model (MSeg3D) with joint intra-modal feature extraction and inter-modal feature fusion to mitigate the modality heterogeneity. The multi-modal fusion in MSeg3D consists of geometry-based feature fusion GF-Phase, cross-modal feature completion, and semantic-based feature fusion SF-Phase on all visible points. The multi-modal data augmentation is reinvigorated by applying asymmetric transformations on LiDAR point cloud and multi-camera images individually, which benefits the model training with diversified augmentation transformations. MSeg3D achieves state-of-the-art results on nuScenes, Waymo, and SemanticKITTI datasets. Under the malfunctioning multi-camera input and the multi-frame point clouds input, MSeg3D still shows robustness and improves the LiDAR-only baseline. Our code is publicly available at https://github.com/jialeli1/lidarseg3d.
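The asymmetric augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see their repository for that). It assumes a single pinhole camera, points already expressed in the camera frame, and illustrative transforms (z-rotation and scaling for the point cloud, horizontal flip for the image). All function names here are hypothetical. The key point is that point-to-pixel correspondences are computed before augmentation, so each modality can then be transformed on its own without breaking the pairing.

```python
import numpy as np


def project_points(points, K):
    """Pinhole projection of camera-frame points (N, 3) to pixel coords (N, 2)."""
    uvw = points @ K.T
    return uvw[:, :2] / uvw[:, 2:3]


def augment_lidar(points, rng):
    """Point-cloud-only transform: random rotation about z plus global scaling."""
    theta = rng.uniform(-np.pi, np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    scale = rng.uniform(0.95, 1.05)
    return (points @ rot.T) * scale


def augment_image(image, pixels, rng):
    """Image-only transform: random horizontal flip, with pixel coords updated."""
    width = image.shape[1]
    if rng.random() < 0.5:
        image = image[:, ::-1]
        pixels = pixels.copy()
        pixels[:, 0] = (width - 1) - pixels[:, 0]
    return image, pixels


def asymmetric_augment(points, image, K, rng):
    """Augment each modality independently while keeping row i of the
    returned points paired with row i of the returned pixel coordinates."""
    # Correspondences come from the ORIGINAL geometry, before any augmentation.
    pixels = project_points(points, K)
    aug_points = augment_lidar(points, rng)                  # does not touch the image
    aug_image, aug_pixels = augment_image(image, pixels, rng)  # does not touch the points
    return aug_points, aug_image, aug_pixels
```

Because the pairing is stored as per-point pixel coordinates rather than recomputed from the augmented geometry, the transforms applied to the two modalities do not need to agree with each other.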
Pages: 21694-21704
Page count: 11