PCDR-DFF: multi-modal 3D object detection based on point cloud diversity representation and dual feature fusion

Cited by: 0
Authors
Xia, Chenxing [1 ,2 ,3 ]
Li, Xubing [1 ]
Gao, Xiuju [4 ]
Ge, Bin [1 ]
Li, Kuan-Ching [5 ]
Fang, Xianjin [1 ,6 ]
Zhang, Yan [7 ]
Yang, Ke [2 ]
Affiliations
[1] Anhui Univ Sci & Technol, Coll Comp Sci & Engn, Huainan 232001, Peoples R China
[2] Inst Energy, Hefei Comprehens Natl Sci Ctr, Hefei, Anhui, Peoples R China
[3] Anhui Purvar Bigdata Technol Co Ltd, Huainan 232001, Peoples R China
[4] Anhui Univ Sci & Technol, Coll Elect & Informat Engn, Huainan, Anhui, Peoples R China
[5] Providence Univ, Dept Comp Sci & Informat Engn, Taichung, Taiwan
[6] Inst Artificial Intelligence, Hefei Comprehens Natl Sci Ctr, Hefei, Peoples R China
[7] Anhui Univ, Sch Elect & Informat Engn, Hefei, Anhui, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2024 / Vol. 36 / Issue 16
Fund
National Natural Science Foundation of China;
Keywords
3D Object detection; Graph neural networks; Multi-modal; Point cloud;
DOI
10.1007/s00521-024-09561-w
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, multi-modal 3D object detection techniques based on point clouds and images have received increasing attention. However, existing multi-modal feature fusion strategies are often relatively simplistic, and single point cloud representations have inherent limitations: voxelization may lose fine-grained information, while 2D images lack depth information, which restricts detection accuracy. Therefore, in this work, we propose PCDR-DFF, a novel multi-modal 3D object detection method based on point cloud diversity representation and dual feature fusion, to improve the prediction accuracy of 3D object detection. First, the point clouds are projected into the image coordinate system, and multi-level image features corresponding to the points are extracted with a 2D backbone network. Then, the point clouds are jointly represented as graphs and pillars, and their 3D features are extracted using graph neural networks and residual connections. Finally, a dual feature fusion method is designed to improve detection accuracy with the help of a well-designed multi-point fusion module and a multi-feature fusion mechanism embedded with a sparse 3D U-Net. Extensive experiments on the KITTI dataset demonstrate the effectiveness and competitiveness of the proposed model in comparison with other methods.
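The first step described in the abstract, projecting LiDAR points into the image coordinate system so that image features can be associated with each point, follows the standard KITTI calibration convention. The sketch below illustrates that projection only; the matrices `P2`, `R0`, and `Tr` use illustrative values (real values come from KITTI calibration files), and the function name is hypothetical, not from the paper.

```python
import numpy as np

# Illustrative calibration in the KITTI convention (not real calibration values):
# P2: 3x4 camera projection, R0: 3x3 rectification, Tr: 3x4 LiDAR-to-camera.
P2 = np.array([[700.0,   0.0, 600.0, 0.0],
               [  0.0, 700.0, 180.0, 0.0],
               [  0.0,   0.0,   1.0, 0.0]])
R0 = np.eye(3)
# Axis swap LiDAR (x fwd, y left, z up) -> camera (x right, y down, z fwd),
# with zero translation for simplicity.
Tr = np.hstack([np.array([[0.0, -1.0,  0.0],
                          [0.0,  0.0, -1.0],
                          [1.0,  0.0,  0.0]]), np.zeros((3, 1))])

def project_lidar_to_image(pts_lidar):
    """Project Nx3 LiDAR points to pixel coordinates (u, v) plus depth."""
    n = pts_lidar.shape[0]
    pts_h = np.hstack([pts_lidar, np.ones((n, 1))])   # Nx4 homogeneous
    pts_cam = (R0 @ (Tr @ pts_h.T)).T                 # Nx3 in camera frame
    uvw = (P2 @ np.hstack([pts_cam, np.ones((n, 1))]).T).T  # Nx3
    depth = uvw[:, 2]
    uv = uvw[:, :2] / depth[:, None]                  # perspective divide
    return uv, depth

# One point 10 m ahead, 1 m to the left, 0.5 m below the sensor.
uv, depth = project_lidar_to_image(np.array([[10.0, 1.0, -0.5]]))
```

Once each point has a pixel location, the corresponding multi-level 2D backbone features can be sampled at `uv` (e.g., by bilinear interpolation) and attached to that point for fusion.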
Pages: 9329-9346
Page count: 18
Related Papers
50 records
  • [21] MLF3D: Multi-Level Fusion for Multi-Modal 3D Object Detection
    Jiang, Han
    Wang, Jianbin
    Xiao, Jianru
    Zhao, Yanan
    Chen, Wanqing
    Ren, Yilong
    Yu, Haiyang
    2024 35TH IEEE INTELLIGENT VEHICLES SYMPOSIUM, IEEE IV 2024, 2024, : 1588 - 1593
  • [22] A Multi-Modal Fusion-Based 3D Multi-Object Tracking Framework With Joint Detection
    Wang, Xiyang
    Fu, Chunyun
    He, Jiawei
    Huang, Mingguang
    Meng, Ting
    Zhang, Siyu
    Zhou, Hangning
    Xu, Ziyao
    Zhang, Chi
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (01): : 532 - 539
  • [23] Height-Adaptive Deformable Multi-Modal Fusion for 3D Object Detection
    Li, Jiahao
    Chen, Lingshan
    Li, Zhen
    IEEE ACCESS, 2025, 13 : 52385 - 52396
  • [24] Enhancing 3D object detection through multi-modal fusion for cooperative perception
    Xia, Bin
    Zhou, Jun
    Kong, Fanyu
    You, Yuhe
    Yang, Jiarui
    Lin, Lin
    ALEXANDRIA ENGINEERING JOURNAL, 2024, 104 : 46 - 55
  • [25] 3D Object Detection Method with Image Semantic Feature Guidance and Cross-Modal Fusion of Point Cloud
    Li, Hui
    Wang, Junyin
    Cheng, Yuanzhi
    Liu, Jian
    Zhao, Guowei
    Chen, Shuangmin
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2024, 36 (05): : 734 - 749
  • [26] Multi-Modal 3D Object Detection by Box Matching
    Liu, Zhe
    Ye, Xiaoqing
    Zou, Zhikang
    He, Xinwei
    Tan, Xiao
    Ding, Errui
    Wang, Jingdong
    Bai, Xiang
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024,
  • [27] Unlocking the power of multi-modal fusion in 3D object tracking
    Hu, Yue
    IET COMPUTER VISION, 2025, 19 (01)
  • [28] GNN-based Point Cloud Maps Feature Extraction and Residual Feature Fusion for 3D Object Detection
    Liao, Wei-Hsiang
    Wang, Chieh-Chih
    Lin, Wen-Chieh
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2023), 2023, : 7010 - 7016
  • [29] GraphBEV: Towards Robust BEV Feature Alignment for Multi-modal 3D Object Detection
    Song, Ziying
    Yang, Lei
    Xu, Shaoqing
    Liu, Lin
    Xu, Dongyang
    Jia, Caiyan
    Jia, Feiyang
    Wang, Li
    COMPUTER VISION - ECCV 2024, PT XXVI, 2025, 15084 : 347 - 366
  • [30] AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection
    Chen, Zehui
    Li, Zhenyu
    Zhang, Shiquan
    Fang, Liangji
    Jiang, Qinhong
    Zhao, Feng
    Zhou, Bolei
    Zhao, Hang
    PROCEEDINGS OF THE THIRTY-FIRST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2022, 2022, : 827 - 833