Deformable Feature Fusion Network for Multi-Modal 3D Object Detection

被引:0
|
作者
Guo, Kun [1 ]
Gan, Tong [2 ]
Ding, Zhao [3 ]
Ling, Qiang [1 ]
机构
[1] Univ Sci & Technol China, Dept Automat, Hefei, Peoples R China
[2] Anhui ShineAuto Autonomous Driving Technol Co Ltd, Res & Dev Dept, Hefei, Peoples R China
[3] Anhui JiangHuai Automobile Grp Co Ltd, Inst Intelligent & Networked Automobile, Hefei, Peoples R China
关键词
3D object detection; multi-modal fusion; feature alignment; VOXELNET;
D O I
10.1109/RAIIC61787.2024.10670940
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
LiDAR and cameras are two widely used sensors in 3D object detection. LiDAR point clouds show geometry knowledge of objects, while RGB images provide semantic information, such as color and texture. How to effectively fuse their features is the key to improving detection performance. This paper proposes a Deformable Feature Fusion Network, which performs LiDAR-camera fusion in a flexible way. We present multi-modal features in the bird's-eye view(BEV), and build a Deformable-Attention Fusion(DAF) module to conduct feature fusion. Besides fusion methods, feature alignment is also important in multi-modal detection. Data augmentation of point clouds may change the projection relationship between RGB images and LiDAR point clouds and causes feature misalignment. We introduce a Feature Alignment Transform(FAT) module and alleviate the problem without introducing any trainable parameters. We conduct experiments on the KITTI dataset to evaluate the effectiveness of proposed modules and the experiment results show that our method outperforms most existing methods.
引用
收藏
页码:363 / 367
页数:5
相关论文
共 50 条
  • [1] Dual-domain deformable feature fusion for multi-modal 3D object detection
    Wang, Shihao
    Deng, Tao
    JOURNAL OF ELECTRONIC IMAGING, 2024, 33 (06)
  • [2] Deformable Feature Aggregation for Dynamic Multi-modal 3D Object Detection
    Chen, Zehui
    Li, Zhenyu
    Zhang, Shiquan
    Fang, Liangji
    Jiang, Qinhong
    Zhao, Feng
    COMPUTER VISION, ECCV 2022, PT VIII, 2022, 13668 : 628 - 644
  • [3] Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection
    Li, Xin
    Shi, Botian
    Hou, Yuenan
    Wu, Xingjiao
    Ma, Tianlong
    Li, Yikang
    He, Liang
    COMPUTER VISION, ECCV 2022, PT XXXVIII, 2022, 13698 : 691 - 707
  • [4] Multi-modal feature fusion for 3D object detection in the production workshop
    Hou, Rui
    Chen, Guangzhu
    Han, Yinhe
    Tang, Zaizuo
    Ru, Qingjun
    APPLIED SOFT COMPUTING, 2022, 115
  • [5] Height-Adaptive Deformable Multi-Modal Fusion for 3D Object Detection
    Li, Jiahao
    Chen, Lingshan
    Li, Zhen
    IEEE ACCESS, 2025, 13 : 52385 - 52396
  • [6] Frustum FusionNet: Amodal 3D Object Detection with Multi-Modal Feature Fusion
    Zuo, Liangyu
    Li, Yaochen
    Han, Mengtao
    Li, Qiao
    Liu, Yuehu
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 2746 - 2751
  • [7] MULTI-MODAL FEATURE FUSION NETWORK FOR GHOST IMAGING OBJECT DETECTION
    Hu, Nan
    Ma, Huimin
    Le, Chao
    Shao, Xuehui
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 351 - 355
  • [8] ObjectFusion: Multi-modal 3D Object Detection with Object-Centric Fusion
    Cai, Qi
    Pan, Yingwei
    Yao, Ting
    Ngo, Chong-Wah
    Mei, Tao
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 18021 - 18030
  • [9] FuseNet: a multi-modal feature fusion network for 3D shape classification
    Zhao, Xin
    Chen, Yinhuang
    Yang, Chengzhuan
    Fang, Lincong
    VISUAL COMPUTER, 2025, 41 (04): : 2973 - 2985
  • [10] Deep multi-scale and multi-modal fusion for 3D object detection
    Guo, Rui
    Li, Deng
    Han, Yahong
    PATTERN RECOGNITION LETTERS, 2021, 151 : 236 - 242