High-order multilayer attention fusion network for 3D object detection

Cited by: 0
Authors
Zhang, Baowen [1 ]
Zhao, Yongyong [1 ]
Su, Chengzhi [1 ]
Cao, Guohua [1 ]
Affiliations
[1] Changchun Univ Sci & Technol, Sch Mech & Elect Engn, Changchun, Peoples R China
Keywords
attention feature fusion; high-order feature; 3D object detection; point cloud
DOI
10.1002/eng2.12987
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Three-dimensional object detection based on the fusion of 2D image data and 3D point clouds has become a research hotspot in 3D scene understanding. However, data from different sensors differ in spatial position, scale, and alignment, which severely degrades detection performance, and inappropriate fusion methods can lose valuable information or introduce interference. We therefore propose the High-Order Multi-Level Attention Fusion Network (HMAF-Net), which takes camera images and voxelized point clouds as inputs for 3D object detection. To enhance the expressive power of features across modalities, we introduce a high-order feature fusion module that performs multi-level convolution operations on the element-wise summed features. By incorporating filtering and non-linear activation, we extract deep semantic information from the fused multi-modal features. To make the most of the salient fused features, we introduce an attention mechanism that dynamically evaluates the importance of the pooled features at each level, enabling adaptive weighted fusion of significant and secondary features. To validate HMAF-Net, we conduct experiments on the KITTI dataset. In the "Car," "Pedestrian," and "Cyclist" categories, HMAF-Net achieves mAP of 81.78%, 60.09%, and 63.91%, respectively, demonstrating more stable performance than other multi-modal methods. We further evaluate the framework's effectiveness and generalization capability on the KITTI benchmark test and compare it with other published detection methods on the 3D detection and BEV detection benchmarks for the "Car" category, where it shows excellent results. The code and model will be made available on .
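The fusion scheme summarized in the abstract (element-wise sum of the two modalities, several convolution levels with non-linear activation, then attention weights over the pooled per-level features) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 1D feature vectors, the fixed kernels, the ReLU activation, and the use of the global-average-pooled value directly as the attention score are all assumptions made for illustration, and the function name `high_order_attention_fusion` is hypothetical. In a trained network the kernels and the attention scoring would be learned.

```python
import math


def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def conv1d_same(values, kernel):
    """1D cross-correlation with zero padding; output has the input's length.

    Assumes an odd-length kernel.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(values) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(values))
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def high_order_attention_fusion(image_feat, point_feat, kernels):
    """Hypothetical sketch: sum the modalities, stack conv levels, weight them.

    Returns the fused feature vector and the per-level attention weights.
    """
    # Step 1: element-wise sum of image and point-cloud features.
    current = [a + b for a, b in zip(image_feat, point_feat)]

    # Step 2: each level convolves the previous level's output, then applies
    # ReLU, so later levels hold higher-order features.
    levels = []
    for kernel in kernels:
        current = relu(conv1d_same(current, kernel))
        levels.append(current)

    # Step 3: global average pooling per level gives one score per level
    # (assumption: the pooled value itself is used as the attention score).
    pooled = [sum(level) / len(level) for level in levels]
    weights = softmax(pooled)

    # Step 4: adaptive weighted sum of the levels.
    fused = [
        sum(w * level[i] for w, level in zip(weights, levels))
        for i in range(len(levels[0]))
    ]
    return fused, weights


image_feat = [1.0, 2.0, 3.0, 4.0]   # toy image-branch features
point_feat = [0.5, 0.5, 0.5, 0.5]   # toy point-cloud-branch features
kernels = [[0.25, 0.5, 0.25], [0.25, 0.5, 0.25]]  # two smoothing levels

fused, weights = high_order_attention_fusion(image_feat, point_feat, kernels)
print(fused, weights)
```

The softmax keeps the level weights positive and summing to one, so every level contributes to the output while levels with larger pooled responses contribute more, which mirrors the "significant and secondary features" weighting described above.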
Pages: 14