Towards efficient multi-modal 3D object detection: Homogeneous sparse fuse network

Cited by: 1

Authors:

Tang, Yingjuan [1]

He, Hongwen [1]

Wang, Yong [1]

Wu, Jingda [2]

Affiliations:

[1] Beijing Inst Technol, Sch Mech Engn, Beijing 100081, Peoples R China

[2] Nanyang Technol Univ, Sch Mech & Aerosp Engn, 50 Nanyang Ave, Singapore 639798, Singapore

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Vol. 256

Keywords:

Autonomous driving; 3D object detection; Multi-modal; Sparse convolutional networks; Point cloud and image fusion; Homogeneous fusion;

DOI:

10.1016/j.eswa.2024.124945

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

LiDAR-only 3D detection methods struggle with the sparsity of point clouds. To overcome this issue, multi-modal methods have been proposed, but fusion remains a challenge because images and point clouds have heterogeneous representations. This paper proposes a novel multi-modal framework, Homogeneous Sparse Fusion (HS-Fusion), which generates pseudo point clouds from depth completion. The framework introduces a 3D foreground-aware middle extractor that efficiently extracts high-responding foreground features from sparse point cloud data; this module can be integrated into existing sparse convolutional neural networks. Furthermore, the proposed homogeneous attentive fusion enables cross-modality consistency fusion. Finally, HS-Fusion simultaneously combines 2D image features and 3D geometric features of pseudo point clouds using multi-representation feature extraction. The proposed network has been found to attain better performance on the 3D object detection benchmarks. In particular, the proposed model demonstrates a 4.02% improvement in accuracy compared to the pure model. Moreover, its inference speed surpasses that of other models, further validating the efficacy of HS-Fusion.

Pages: 12
