MENet: Multi-Modal Mapping Enhancement Network for 3D Object Detection in Autonomous Driving

Citations: 5
Authors
Liu, Moyun [1 ,2 ,3 ]
Chen, Youping [1 ]
Xie, Jingming [1 ]
Zhu, Yijie [1 ]
Zhang, Yang [4 ,5 ]
Yao, Lei [6 ]
Bing, Zhenshan [7 ]
Zhuang, Genghang [7 ]
Huang, Kai [8 ]
Zhou, Joey Tianyi [2 ,3 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, Wuhan 430074, Peoples R China
[2] ASTAR, IHPC, Singapore 138632, Singapore
[3] ASTAR, CFAR, Singapore 138632, Singapore
[4] Hubei Univ Technol, Sch Mech Engn, Wuhan 430068, Peoples R China
[5] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
[6] Hong Kong Polytech Univ, Dept Elect & Elect Engn, Hong Kong, Peoples R China
[7] Tech Univ Munich, Dept Informat, D-85748 Munich, Germany
[8] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510006, Peoples R China
Keywords
Multi-modal fusion; 3D object detection; mapping enhancement; autonomous driving;
DOI
10.1109/TITS.2024.3387398
CLC (Chinese Library Classification) number
TU [Building Science];
Discipline classification code
0813;
Abstract
To achieve more accurate perception, LiDAR and cameras are increasingly used together to improve 3D object detection. However, building an effective fusion mechanism remains a non-trivial task, and this hinders the development of multi-modal methods. In particular, the construction of the mapping relationship between the two modalities is far from fully explored. Canonical cross-modal mapping fails when the calibration matrix is inaccurate, and it also wastes much of the abundant, dense information in RGB images. This paper aims to extend the traditional one-to-one alignment relationship between LiDAR and camera. For every projected point, we enhance the cross-modal mapping relationship by aggregating color-texture-related and shape-contour-related features. Further, a mapping pyramid is proposed to leverage the semantic representations of image features at different stages. Based on these mapping enhancement strategies, our method increases the utilization of image information. Finally, we design an attention-based fusion module to refine point cloud features with auxiliary image features. Extensive experiments on the KITTI and SUN RGB-D datasets show that our model achieves strong 3D object detection performance compared with other multi-modal fusion networks, especially for categories with sparse point clouds.
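To make the "one-to-one alignment" that the abstract criticizes concrete, the sketch below shows the canonical LiDAR-to-camera mapping for KITTI-style data: each 3D point is projected into the image plane with a calibration matrix and a single pixel feature is sampled per point. This is a minimal illustration only, not the authors' implementation; the function names (project_lidar_to_image, sample_pixel_features) and the matrix symbol P are assumptions introduced here for clarity.

```python
# Minimal sketch (not the authors' code) of the canonical one-to-one
# LiDAR-to-camera mapping that MENet extends. Assumes a KITTI-style
# 3x4 projection matrix P (intrinsics x extrinsics).
import numpy as np


def project_lidar_to_image(points_xyz: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Project N LiDAR points (N, 3) to pixel coordinates (N, 2)."""
    n = points_xyz.shape[0]
    # Homogeneous coordinates: (N, 4)
    pts_h = np.hstack([points_xyz, np.ones((n, 1))])
    # Apply the calibration/projection matrix: (N, 3)
    proj = pts_h @ P.T
    # Perspective division by depth to obtain (u, v) pixel coordinates
    uv = proj[:, :2] / np.clip(proj[:, 2:3], 1e-6, None)
    return uv


def sample_pixel_features(image_feat: np.ndarray, uv: np.ndarray) -> np.ndarray:
    """Nearest-neighbour lookup of an (H, W, C) feature map at (N, 2) pixels.

    This single-pixel, one-to-one sampling is the mapping the paper argues
    under-uses dense image information and degrades when P is miscalibrated.
    """
    h, w, _ = image_feat.shape
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return image_feat[v, u]  # (N, C) image feature per point
```

According to the abstract, MENet replaces this single-pixel lookup by aggregating color-texture-related and shape-contour-related features around each projected point, drawing image features from multiple pyramid stages, and fusing them with point features through an attention module; the sketch only depicts the baseline mapping being extended.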
Pages: 9397-9410
Page count: 14