MENet: Multi-Modal Mapping Enhancement Network for 3D Object Detection in Autonomous Driving

Citations: 5
Authors
Liu, Moyun [1 ,2 ,3 ]
Chen, Youping [1 ]
Xie, Jingming [1 ]
Zhu, Yijie [1 ]
Zhang, Yang [4 ,5 ]
Yao, Lei [6 ]
Bing, Zhenshan [7 ]
Zhuang, Genghang [7 ]
Huang, Kai [8 ]
Zhou, Joey Tianyi [2 ,3 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, Wuhan 430074, Peoples R China
[2] ASTAR, IHPC, Singapore 138632, Singapore
[3] ASTAR, CFAR, Singapore 138632, Singapore
[4] Hubei Univ Technol, Sch Mech Engn, Wuhan 430068, Peoples R China
[5] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[6] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Hong Kong, Peoples R China
[7] Tech Univ Munich, Dept Informat, D-85748 Munich, Germany
[8] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510006, Peoples R China
Keywords
Multi-modal fusion; 3D object detection; mapping enhancement; autonomous driving;
DOI
10.1109/TITS.2024.3387398
CLC (Chinese Library Classification) number
TU [Building Science];
Discipline classification code
0813;
Abstract
To achieve more accurate perception, LiDAR and cameras are increasingly used together to improve 3D object detection. However, building an effective fusion mechanism remains a non-trivial task, and this hinders the development of multi-modal methods. In particular, the construction of the mapping relationship between the two modalities is far from fully explored. Canonical cross-modal mapping fails when the calibration matrix is inaccurate, and it also wastes much of the abundant, dense information in RGB images. This paper aims to extend the traditional one-to-one alignment relationship between LiDAR and camera. For every projected point, we enhance the cross-modal mapping relationship by aggregating color-texture-related and shape-contour-related features. Further, a mapping pyramid is proposed to leverage the semantic representations of image features at different stages. Based on these mapping enhancement strategies, our method increases the utilization of image information. Finally, we design an attention-based fusion module to refine point cloud features with auxiliary image features. Extensive experiments on the KITTI and SUN RGB-D datasets show that our model achieves strong 3D object detection performance compared with other multi-modal fusion networks, especially for categories with sparse point clouds.
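To make the "one-to-one alignment" that the abstract criticizes concrete, the sketch below shows the canonical LiDAR-to-camera mapping for KITTI-style data: each 3D point is projected into the image plane with a calibration matrix and a single pixel feature is sampled per point. This is a minimal illustration only, not the authors' implementation; the function names (project_lidar_to_image, sample_pixel_features) and the matrix symbol P are assumptions introduced here for clarity.

```python
# Minimal sketch (not the authors' code) of the canonical one-to-one
# LiDAR-to-camera mapping that MENet extends. Assumes a KITTI-style
# 3x4 projection matrix P (intrinsics x extrinsics).
import numpy as np


def project_lidar_to_image(points_xyz: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Project N LiDAR points (N, 3) to pixel coordinates (N, 2)."""
    n = points_xyz.shape[0]
    # Homogeneous coordinates: (N, 4)
    pts_h = np.hstack([points_xyz, np.ones((n, 1))])
    # Apply the calibration/projection matrix: (N, 3)
    proj = pts_h @ P.T
    # Perspective division by depth to obtain (u, v) pixel coordinates
    uv = proj[:, :2] / np.clip(proj[:, 2:3], 1e-6, None)
    return uv


def sample_pixel_features(image_feat: np.ndarray, uv: np.ndarray) -> np.ndarray:
    """Nearest-neighbour lookup of an (H, W, C) feature map at (N, 2) pixels.

    This single-pixel, one-to-one sampling is the mapping the paper argues
    under-uses dense image information and degrades when P is miscalibrated.
    """
    h, w, _ = image_feat.shape
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return image_feat[v, u]  # (N, C) image feature per point
```

According to the abstract, MENet replaces this single-pixel lookup by aggregating color-texture-related and shape-contour-related features around each projected point, drawing image features from multiple pyramid stages, and fusing them with point features through an attention module; the sketch only depicts the baseline mapping being extended.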
Pages: 9397-9410
Page count: 14