Towards efficient multi-modal 3D object detection: Homogeneous sparse fuse network

Citations: 1
Authors
Tang, Yingjuan [1]
He, Hongwen [1]
Wang, Yong [1]
Wu, Jingda [2]
Affiliations
[1] Beijing Inst Technol, Sch Mech Engn, Beijing 100081, Peoples R China
[2] Nanyang Technol Univ, Sch Mech & Aerosp Engn, 50 Nanyang Ave, Singapore 639798, Singapore
Keywords
Autonomous driving; 3D object detection; Multi-modal; Sparse convolutional networks; Point cloud and image fusion; Homogeneous fusion
DOI
10.1016/j.eswa.2024.124945
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
LiDAR-only 3D detection methods struggle with the sparsity of point clouds. Multi-modal methods have been proposed to overcome this issue, but fusion remains challenging because images and point clouds have heterogeneous representations. This paper proposes a novel multi-modal framework, Homogeneous Sparse Fusion (HS-Fusion), which generates pseudo point clouds from depth completion. The framework introduces a 3D foreground-aware middle extractor that efficiently extracts high-responding foreground features from sparse point cloud data; this module can be integrated into existing sparse convolutional neural networks. In addition, the proposed homogeneous attentive fusion enables consistent cross-modality fusion. Finally, HS-Fusion simultaneously combines 2D image features and 3D geometric features of pseudo point clouds using multi-representation feature extraction. The proposed network attains better performance on 3D object detection benchmarks. In particular, the proposed model demonstrates a 4.02% improvement in accuracy compared to the pure model. Moreover, its inference speed surpasses that of other models, further validating the efficacy of HS-Fusion.
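The abstract does not say how the pseudo point clouds are built from the completed depth map. As a rough illustration only, the sketch below uses standard pinhole back-projection, which is a common way to lift a dense depth map into 3D points and is not necessarily the authors' procedure. The function name `depth_to_pseudo_points`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the toy depth map are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_pseudo_points(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a dense depth map into camera-frame 3D points.

    Assumes a pinhole camera model (hypothetical intrinsics fx, fy, cx, cy).
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if z <= 0.0:
                continue
            x = (u - cx) * z / fx          # horizontal offset scaled by depth
            y = (v - cy) * z / fy          # vertical offset scaled by depth
            points.append((x, y, z))
    return points


# Toy 2x2 depth map (metres); zeros mark pixels without a depth estimate.
depth_map = [
    [0.0, 2.0],
    [4.0, 0.0],
]
pseudo_points = depth_to_pseudo_points(depth_map, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
print(pseudo_points)
```

In a real pipeline the resulting camera-frame points would still need to be transformed into the LiDAR frame with the sensor calibration before being processed alongside the raw point cloud; that step is omitted here.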
Pages: 12