Deep Multi-Modal Object Detection and Semantic Segmentation for Autonomous Driving: Datasets, Methods, and Challenges

Cited by: 705
Authors
Feng, Di [1 ,2 ]
Haase-Schutz, Christian [3 ,4 ]
Rosenbaum, Lars [1 ]
Hertlein, Heinz [3 ]
Glaser, Claudius [1 ]
Timm, Fabian [1 ]
Wiesbeck, Werner [4 ]
Dietmayer, Klaus [2 ]
Affiliations
[1] Robert Bosch GmbH, Corp Res, Driver Assistance Syst & Automated Driving, D-71272 Renningen, Germany
[2] Ulm Univ, Inst Measurement Control & Microtechnol, D-89081 Ulm, Germany
[3] Robert Bosch GmbH, Chassis Syst Control, Engn Cognit Syst, Automated Driving, D-74232 Abstatt, Germany
[4] Karlsruhe Inst Technol, Inst Radio Frequency Engn & Elect, D-76131 Karlsruhe, Germany
Keywords
Multi-modality; object detection; semantic segmentation; deep learning; autonomous driving; NEURAL-NETWORKS; ROAD; FUSION; LIDAR; ENVIRONMENTS; SET;
DOI: 10.1109/TITS.2020.2972974
Chinese Library Classification: TU [Building Science]
Discipline code: 0813
Abstract
Recent advancements in perception for autonomous driving are driven by deep learning. In order to achieve robust and accurate scene understanding, autonomous vehicles are usually equipped with different sensors (e.g., cameras, LiDARs, radars), and multiple sensing modalities can be fused to exploit their complementary properties. In this context, many methods have been proposed for deep multi-modal perception problems. However, there is no general guideline for network architecture design, and questions of "what to fuse", "when to fuse", and "how to fuse" remain open. This review paper attempts to systematically summarize methodologies and discuss challenges for deep multi-modal object detection and semantic segmentation in autonomous driving. To this end, we first provide an overview of on-board sensors on test vehicles, open datasets, and background information for object detection and semantic segmentation in autonomous driving research. We then summarize the fusion methodologies and discuss challenges and open questions. In the appendix, we provide tables that summarize topics and methods. We also provide an interactive online platform to navigate each reference: https://boschresearch.github.io/multimodalperception/.
Pages: 1341-1360 (20 pages)
Related Papers (50 in total)
  • [41] Exploiting Multi-Modal Fusion for Urban Autonomous Driving Using Latent Deep Reinforcement Learning
    Khalil, Yasser H.
    Mouftah, Hussein T.
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (03) : 2921 - 2935
  • [42] Object detection in autonomous driving - from large to small datasets
    Iancu, David-Traian
    Sorici, Alexandru
    Florea, Adina Magda
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON ELECTRONICS, COMPUTERS AND ARTIFICIAL INTELLIGENCE (ECAI-2019), 2019,
  • [43] Application of Multi-modal Fusion Attention Mechanism in Semantic Segmentation
    Liu, Yunlong
    Yoshie, Osamu
    Watanabe, Hiroshi
    COMPUTER VISION - ACCV 2022, PT VII, 2023, 13847 : 378 - 397
  • [44] Radar-Camera Fusion for Object Detection and Semantic Segmentation in Autonomous Driving: A Comprehensive Review
    Yao, Shanliang
    Guan, Runwei
    Huang, Xiaoyu
    Li, Zhuoxiao
    Sha, Xiangyu
    Yue, Yong
    Lim, Eng Gee
    Seo, Hyungjoon
    Man, Ka Lok
    Zhu, Xiaohui
    Yue, Yutao
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01): : 2094 - 2128
  • [45] Multi-modal unsupervised domain adaptation for semantic image segmentation
    Hu, Sijie
    Bonardi, Fabien
    Bouchafa, Samia
    Sidibe, Desire
    PATTERN RECOGNITION, 2023, 137
  • [46] Multi-modal Prototypes for Open-World Semantic Segmentation
    Yang, Yuhuan
    Ma, Chaofan
    Ju, Chen
    Zhang, Fei
    Yao, Jiangchao
    Zhang, Ya
    Wang, Yanfeng
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (12) : 6004 - 6020
  • [47] Semantic Segmentation of Defects in Infrastructures through Multi-modal Images
    Shahsavarani, Sara
    Lopez, Fernando
    Ibarra-Castanedo, Clemente
    Maldague, Xavier P. V.
    THERMOSENSE: THERMAL INFRARED APPLICATIONS XLVI, 2024, 13047
  • [48] Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving
    Wang, Li
    Zhang, Xinyu
    Li, Jun
    Xv, Baowei
    Fu, Rong
    Chen, Haifeng
    Yang, Lei
    Jin, Dafeng
    Zhao, Lijun
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (05) : 5628 - 5641
  • [49] Ticino: A multi-modal remote sensing dataset for semantic segmentation
    Barbato, Mirko Paolo
    Piccoli, Flavio
    Napoletano, Paolo
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [50] Multi-Modal Sensor Fusion and Object Tracking for Autonomous Racing
    Karle, Phillip
    Fent, Felix
    Huch, Sebastian
    Sauerbeck, Florian
    Lienkamp, Markus
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (07): : 3871 - 3883