DMFF: dual-way multimodal feature fusion for 3D object detection

Citations: 0

Authors:

Dong, Xiaopeng [1]

Di, Xiaoguang [1]

Wang, Wenzhuang [1]

Affiliations:

[1] Harbin Inst Technol, Control & Simulat Ctr, Harbin, Peoples R China

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2024 / Vol. 18 / No. 01

Funding:

Natural Science Foundation of Heilongjiang Province;

Keywords:

3D object detection; Multimodal feature fusion; Self-attention mechanism; Lidar point clouds; RGB images;

DOI:

10.1007/s11760-023-02772-z

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Multimodal 3D object detection, which fuses complementary information from LiDAR data and RGB images, has recently become an active research topic. However, fusing images and point clouds is not trivial because the two modalities use different representations, and inadequate feature fusion degrades detection performance. To address these problems, we convert images into pseudo point clouds using depth completion and adopt a more efficient feature fusion method. In this paper, we propose a dual-way multimodal feature fusion network (DMFF) for 3D object detection. Specifically, we first use a dual-stream feature extraction module (DSFE) to generate homogeneous LiDAR and pseudo region of interest (RoI) features. Then, we propose a dual-way feature interaction method (DWFI) that enables intermodal and intramodal interaction between the two features. Next, we design a local attention feature fusion module (LAFF) that selects the input features most likely to contribute to the desired output. The proposed DMFF achieves state-of-the-art performance on the KITTI dataset.
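
The abstract does not give the internals of the fusion step, so the following is only a generic illustration of attention-style gated fusion of two homogeneous feature vectors, not the authors' LAFF module. The function name `gated_fusion`, the gate formula, and the weight shapes are all assumptions made for this sketch: a per-channel gate is computed from the concatenated LiDAR and pseudo RoI features, and the output is a convex combination of the two inputs.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(lidar_feat, pseudo_feat, weights, bias):
    """Fuse two equal-length feature vectors with a per-channel gate.

    Hypothetical illustration (not the paper's method):
        gate  = sigmoid(W @ [lidar; pseudo] + b)
        fused = gate * lidar + (1 - gate) * pseudo
    `weights` has one row of length 2C per output channel.
    """
    assert len(lidar_feat) == len(pseudo_feat) == len(weights) == len(bias)
    concat = lidar_feat + pseudo_feat  # list concatenation, length 2C
    fused = []
    for c, (w_row, b) in enumerate(zip(weights, bias)):
        assert len(w_row) == len(concat)
        gate = sigmoid(sum(w * x for w, x in zip(w_row, concat)) + b)
        fused.append(gate * lidar_feat[c] + (1.0 - gate) * pseudo_feat[c])
    return fused


if __name__ == "__main__":
    rng = random.Random(0)
    channels = 4
    # Random stand-ins for one RoI's LiDAR and pseudo-point-cloud features.
    lidar = [rng.uniform(-1, 1) for _ in range(channels)]
    pseudo = [rng.uniform(-1, 1) for _ in range(channels)]
    # Randomly initialised gate parameters; in practice these would be learned.
    W = [[rng.gauss(0, 0.1) for _ in range(2 * channels)] for _ in range(channels)]
    b = [0.0] * channels
    print(gated_fusion(lidar, pseudo, W, b))
```

With all-zero gate parameters the gate is 0.5 everywhere and the result is the plain average of the two inputs; a large positive bias pushes the output toward the LiDAR features. This is the sense in which a gate "selects" which modality contributes more per channel.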


Pages: 455-463

Number of pages: 9
