High-order multilayer attention fusion network for 3D object detection

Cited by: 0
Authors
Zhang, Baowen [1]
Zhao, Yongyong [1]
Su, Chengzhi [1]
Cao, Guohua [1]
Affiliations
[1] Changchun Univ Sci & Technol, Sch Mech & Elect Engn, Changchun, Peoples R China
Keywords
attention feature fusion; high-order feature; 3D object detection; point cloud
DOI
10.1002/eng2.12987
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Three-dimensional object detection that fuses 2D image data with 3D point clouds has become a research hotspot in 3D scene understanding. However, data from different sensors differ in spatial position, scale, and alignment, and these discrepancies severely degrade detection performance. Inappropriate fusion methods can lose valuable information or introduce interference. We therefore propose the High-Order Multi-Level Attention Fusion Network (HMAF-Net), which takes camera images and voxelized point clouds as inputs for 3D object detection. To strengthen the expressive power of the features across modalities, we introduce a high-order feature fusion module that applies multi-level convolution operations to the element-wise sum of the features. Through filtering and non-linear activation, the module extracts deep semantic information from the fused multi-modal features. To make the most of the salient fused features, we introduce an attention mechanism that dynamically evaluates the importance of the pooled features at each level, enabling adaptive weighted fusion of significant and secondary features. We validate HMAF-Net on the KITTI dataset. In the "Car," "Pedestrian," and "Cyclist" categories, HMAF-Net achieves mAP of 81.78%, 60.09%, and 63.91%, respectively, and performs more stably than other multi-modal methods. We further evaluate the framework's effectiveness and generalization capability on the KITTI benchmark test, comparing it with other published detection methods on the 3D detection and BEV detection benchmarks for the "Car" category, where it shows excellent results. The code and model will be made available on .
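The fusion pipeline described in the abstract (element-wise sum, multi-level convolution with non-linear activation, per-level pooling, attention-weighted fusion) can be illustrated with a toy sketch. This is not the authors' HMAF-Net implementation. The abstract does not state kernel sizes, the number of levels, the pooling type, or how importance is scored, so the sketch assumes 1x1 convolutions (a per-location linear map), ReLU, global average pooling, and a softmax over one scalar score per level. The names `fuse`, `level_weights`, and `score_weights` are hypothetical.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def linear(weights, v):
    """Apply a C_out x C_in weight matrix to one C_in-dimensional vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(image_feats, point_feats, level_weights, score_weights):
    """Toy high-order attention fusion over N locations with C channels each.

    image_feats, point_feats: N lists of C floats, assumed already aligned.
    level_weights: one C x C matrix per level (assumed stand-in for a 1x1 conv).
    score_weights: one C-vector per level (assumed scoring of pooled features).
    """
    # Step 1: element-wise sum of the two modalities.
    current = [[a + b for a, b in zip(img, pts)]
               for img, pts in zip(image_feats, point_feats)]

    levels, scores = [], []
    for w, s in zip(level_weights, score_weights):
        # Step 2: one more level of convolution plus non-linear activation.
        current = [relu(linear(w, v)) for v in current]
        levels.append(current)
        # Step 3: global average pooling over locations, then a scalar score.
        n = len(current)
        pooled = [sum(v[c] for v in current) / n for c in range(len(current[0]))]
        scores.append(sum(a * b for a, b in zip(s, pooled)))

    # Step 4: softmax over levels gives the adaptive fusion weights.
    alphas = softmax(scores)
    fused = [[sum(alpha * level[i][c] for alpha, level in zip(alphas, levels))
              for c in range(len(levels[0][0]))]
             for i in range(len(levels[0]))]
    return fused, alphas


# Small demo with random placeholder features and weights (not real data).
random.seed(0)
N, C, L = 4, 3, 2
img = [[random.random() for _ in range(C)] for _ in range(N)]
pts = [[random.random() for _ in range(C)] for _ in range(N)]
convs = [[[random.uniform(-1, 1) for _ in range(C)] for _ in range(C)]
         for _ in range(L)]
scorers = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(L)]
fused, alphas = fuse(img, pts, convs, scorers)
```

In this sketch, `alphas` sums to 1 across levels, so levels whose pooled features score higher contribute more to `fused`. A real detector would apply the same idea to image and voxel feature maps with learned weights.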
Pages: 14