MVX-Net: Multimodal VoxelNet for 3D Object Detection

Cited by: 0
Authors
Sindagi, Vishwanath A. [1]
Zhou, Yin [2]
Tuzel, Oncel [2]
Affiliations
[1] Johns Hopkins Univ, Dept Elect & Comp Engn, Baltimore, MD 21218 USA
[2] Apple Inc, AI Res, Cupertino, CA 95014 USA
Keywords
REPRESENTATION;
DOI
10.1109/icra.2019.8794195
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Many recent works on 3D object detection have focused on designing neural network architectures that can consume point cloud data. While these approaches demonstrate encouraging performance, they are typically based on a single modality and are unable to leverage information from other modalities, such as a camera. Although a few approaches fuse data from different modalities, these methods either use a complicated pipeline to process the modalities sequentially, or perform late fusion and are unable to learn interactions between different modalities at early stages. In this work, we present PointFusion and VoxelFusion: two simple yet effective early-fusion approaches to combine the RGB and point cloud modalities, by leveraging the recently introduced VoxelNet architecture. Evaluation on the KITTI dataset demonstrates significant improvements in performance over approaches which only use point cloud data. Furthermore, the proposed method provides results competitive with the state-of-the-art multimodal algorithms, achieving top-2 ranking in five of the six bird's-eye view and 3D detection categories on the KITTI benchmark, by using a simple single-stage network.
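The abstract does not spell out how the early fusion is carried out, so the following is only a minimal sketch of one common way to fuse at the point level: project each LiDAR point into the image, look up the image feature at that pixel, and append it to the point before any point cloud network sees it. The function name `point_level_fusion`, the `proj` matrix, and the nearest-pixel lookup are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def point_level_fusion(points, image_feats, proj):
    """Append image features to each LiDAR point (illustrative early fusion).

    points:      (N, 3) xyz coordinates in the LiDAR frame
    image_feats: (H, W, C) per-pixel feature map, e.g. from a 2D CNN
    proj:        (3, 4) hypothetical LiDAR-to-pixel projection matrix
    Returns the fused (N, 3 + C) array and an (N,) boolean validity mask.
    """
    n = points.shape[0]
    h, w, c = image_feats.shape

    # Homogeneous coordinates, then projection to the image plane.
    homo = np.hstack([points, np.ones((n, 1))])  # (N, 4)
    uvw = homo @ proj.T                          # (N, 3)

    # Points at or behind the camera plane cannot be matched to a pixel.
    depth = uvw[:, 2]
    in_front = depth > 1e-6
    safe_depth = np.where(in_front, depth, 1.0)  # avoid division by zero

    # Nearest-pixel lookup (an assumption; interpolation is also possible).
    u = np.floor(uvw[:, 0] / safe_depth).astype(int)
    v = np.floor(uvw[:, 1] / safe_depth).astype(int)
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Points that fall outside the image keep a zero feature vector.
    sampled = np.zeros((n, c), dtype=image_feats.dtype)
    sampled[valid] = image_feats[v[valid], u[valid]]

    return np.hstack([points, sampled]), valid
```

In a pipeline of this kind, the fused array would replace the raw point cloud as input to the voxel feature encoder, so the network can learn cross-modal interactions from its first layer rather than merging detections at the end.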
Pages: 7276 - 7282
Number of pages: 7
Related papers
50 items in total
  • [31] Multimodal Cooperative 3D Object Detection Over Connected Vehicles for Autonomous Driving
    Chi, Fangyuan
    Wang, Yixiao
    Pourazad, Mahsa T.
    Nasiopoulos, Panos
    Leung, Victor C. M.
    IEEE NETWORK, 2023, 37 (04): : 265 - 272
  • [32] Emerging Trends in Autonomous Vehicle Perception: Multimodal Fusion for 3D Object Detection
    Alaba, Simegnew Yihunie
    Gurbuz, Ali C.
    Ball, John E.
    WORLD ELECTRIC VEHICLE JOURNAL, 2024, 15 (01):
  • [33] A Multimodal 3D Object Detection Method Based on Double-Fusion Framework
    Ge T.-A.
    Li H.
    Guo Y.
    Wang J.-Y.
    Zhou D.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2023, 51 (11): : 3100 - 3110
  • [34] DMFF: dual-way multimodal feature fusion for 3D object detection
    Xiaopeng Dong
    Xiaoguang Di
    Wenzhuang Wang
    Signal, Image and Video Processing, 2024, 18 (1) : 455 - 463
  • [35] SEG-VoxelNet for 3D Vehicle Detection from RGB and LiDAR Data
    Dou, Jian
    Xue, Jianru
    Fang, Jianwu
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 4362 - 4368
  • [36] Density-Net: A Density-Aware Network for 3D Object Detection
    Li, Hongyu
    Guo, Youhui
    Zhou, Yu
    Wang, Weiping
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 1105 - 1112
  • [37] RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection
    Sun, Pei
    Wang, Weiyue
    Chai, Yuning
    Elsayed, Gamaleldin
    Bewley, Alex
    Zhang, Xiao
    Sminchisescu, Cristian
    Anguelov, Dragomir
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5721 - 5730
  • [38] 3D Object Detection with Pointformer
    Pan, Xuran
    Xia, Zhuofan
    Song, Shiji
    Li, Li Erran
    Huang, Gao
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 7459 - 7468
  • [39] A survey of 3D object detection
    Wei Liang
    Pengfei Xu
    Ling Guo
    Heng Bai
    Yang Zhou
    Feng Chen
    Multimedia Tools and Applications, 2021, 80 : 29617 - 29641
  • [40] A survey of 3D object detection
    Liang, Wei
    Xu, Pengfei
    Guo, Ling
    Bai, Heng
    Zhou, Yang
    Chen, Feng
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (19) : 29617 - 29641