Deep Multi-Modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges

Cited by: 705
Authors
Feng, Di [1 ,2 ]
Haase-Schutz, Christian [3 ,4 ]
Rosenbaum, Lars [1 ]
Hertlein, Heinz [3 ]
Glaser, Claudius [1 ]
Timm, Fabian [1 ]
Wiesbeck, Werner [4 ]
Dietmayer, Klaus [2 ]
Affiliations
[1] Robert Bosch GmbH, Corp Res, Driver Assistance Syst & Automated Driving, D-71272 Renningen, Germany
[2] Ulm Univ, Inst Measurement Control & Microtechnol, D-89081 Ulm, Germany
[3] Robert Bosch GmbH, Chassis Syst Control, Engn Cognit Syst, Automated Driving, D-74232 Abstatt, Germany
[4] Karlsruhe Inst Technol, Inst Radio Frequency Engn & Elect, D-76131 Karlsruhe, Germany
Keywords
Multi-modality; object detection; semantic segmentation; deep learning; autonomous driving; NEURAL-NETWORKS; ROAD; FUSION; LIDAR; ENVIRONMENTS; SET;
DOI: 10.1109/TITS.2020.2972974
Chinese Library Classification: TU [Building Science]
Discipline code: 0813
Abstract
Recent advancements in perception for autonomous driving are driven by deep learning. In order to achieve robust and accurate scene understanding, autonomous vehicles are usually equipped with different sensors (e.g., cameras, LiDARs, radars), and multiple sensing modalities can be fused to exploit their complementary properties. In this context, many methods have been proposed for deep multi-modal perception problems. However, there is no general guideline for network architecture design, and questions of "what to fuse", "when to fuse", and "how to fuse" remain open. This review paper attempts to systematically summarize methodologies and discuss challenges for deep multi-modal object detection and semantic segmentation in autonomous driving. To this end, we first provide an overview of on-board sensors on test vehicles, open datasets, and background information for object detection and semantic segmentation in autonomous driving research. We then summarize the fusion methodologies and discuss challenges and open questions. In the appendix, we provide tables that summarize topics and methods. We also provide an interactive online platform to navigate each reference: https://boschresearch.github.io/multimodalperception/.
Pages: 1341-1360 (20 pages)
Related Papers (50 in total)
  • [41] Exploiting Multi-Modal Fusion for Urban Autonomous Driving Using Latent Deep Reinforcement Learning
    Khalil, Yasser H.
    Mouftah, Hussein T.
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (03) : 2921 - 2935
  • [42] Object detection in autonomous driving - from large to small datasets
    Iancu, David-Traian
    Sorici, Alexandru
    Florea, Adina Magda
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON ELECTRONICS, COMPUTERS AND ARTIFICIAL INTELLIGENCE (ECAI-2019), 2019,
  • [43] Application of Multi-modal Fusion Attention Mechanism in Semantic Segmentation
    Liu, Yunlong
    Yoshie, Osamu
    Watanabe, Hiroshi
    COMPUTER VISION - ACCV 2022, PT VII, 2023, 13847 : 378 - 397
  • [44] Radar-Camera Fusion for Object Detection and Semantic Segmentation in Autonomous Driving: A Comprehensive Review
    Yao, Shanliang
    Guan, Runwei
    Huang, Xiaoyu
    Li, Zhuoxiao
    Sha, Xiangyu
    Yue, Yong
    Lim, Eng Gee
    Seo, Hyungjoon
    Man, Ka Lok
    Zhu, Xiaohui
    Yue, Yutao
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01): : 2094 - 2128
  • [45] Multi-modal unsupervised domain adaptation for semantic image segmentation
    Hu, Sijie
    Bonardi, Fabien
    Bouchafa, Samia
    Sidibe, Desire
    PATTERN RECOGNITION, 2023, 137
  • [46] Multi-modal Prototypes for Open-World Semantic Segmentation
    Yang, Yuhuan
    Ma, Chaofan
    Ju, Chen
    Zhang, Fei
    Yao, Jiangchao
    Zhang, Ya
    Wang, Yanfeng
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (12) : 6004 - 6020
  • [47] Semantic Segmentation of Defects in Infrastructures through Multi-modal Images
    Shahsavarani, Sara
    Lopez, Fernando
    Ibarra-Castanedo, Clemente
    Maldague, Xavier P. V.
    THERMOSENSE: THERMAL INFRARED APPLICATIONS XLVI, 2024, 13047
  • [48] Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving
    Wang, Li
    Zhang, Xinyu
    Li, Jun
    Xv, Baowei
    Fu, Rong
    Chen, Haifeng
    Yang, Lei
    Jin, Dafeng
    Zhao, Lijun
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (05) : 5628 - 5641
  • [49] Ticino: A multi-modal remote sensing dataset for semantic segmentation
    Barbato, Mirko Paolo
    Piccoli, Flavio
    Napoletano, Paolo
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [50] Multi-Modal Sensor Fusion and Object Tracking for Autonomous Racing
    Karle, Phillip
    Fent, Felix
    Huch, Sebastian
    Sauerbeck, Florian
    Lienkamp, Markus
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (07): : 3871 - 3883