DyFusion: Cross-Attention 3D Object Detection with Dynamic Fusion

Cited by: 10
Authors
Bi, Jiangfeng [1]
Wei, Haiyue [1]
Zhang, Guoxin [1]
Yang, Kuihe [1]
Song, Ziying [2]
Affiliations
[1] Hebei Univ Sci & Technol, Sch Informat Sci & Engn, Shijiazhuang 050018, Peoples R China
[2] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing Key Lab Traff Data Anal & Min, Beijing, Peoples R China
Keywords
cross-attention dynamic fusion; synchronous data augmentation; 3D object detection; CNN;
DOI
10.1109/TLA.2024.10412035
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In autonomous driving, LiDAR and camera sensors supply the observational data needed for precise 3D object detection. Existing fusion algorithms exploit the complementary data from both sensors. However, these methods typically concatenate raw point cloud data with pixel-level image features, a process that introduces errors and loses critical information embedded in each modality. To mitigate this loss of feature information, this paper proposes a Cross-Attention Dynamic Fusion (CADF) strategy that dynamically fuses the two heterogeneous data sources. In addition, data augmentation for these two modalities is often insufficient; to address this, we propose a Synchronous Data Augmentation (SDA) strategy designed to enhance training efficiency. We evaluated our method on the KITTI and nuScenes datasets. Our top-performing model attained 82.52% mAP on the KITTI test benchmark, outperforming other state-of-the-art methods.
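The abstract does not give the CADF architecture, so the following is only a minimal sketch of the general idea of cross-attention fusion with a dynamic gate, not the authors' method. It assumes LiDAR point features act as queries and image features act as keys and values, uses identity projections instead of learned weights, and blends the attended image feature with the original LiDAR feature through a sigmoid gate. The function name `cross_attention_fuse` and the gating rule are hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention_fuse(lidar_feats, image_feats):
    """Fuse each LiDAR feature with image features via cross-attention.

    Hypothetical sketch: queries are LiDAR features, keys and values are
    image features, and projections are the identity (no learned weights).
    Returns (fused_features, attention_weights).
    """
    d = len(lidar_feats[0])
    scale = math.sqrt(d)
    fused, all_weights = [], []
    for q in lidar_feats:
        # Scaled dot-product attention of one LiDAR query over all image features.
        weights = softmax([dot(q, k) / scale for k in image_feats])
        attended = [
            sum(w * v[j] for w, v in zip(weights, image_feats)) for j in range(d)
        ]
        # Assumed "dynamic" part: a per-point scalar gate in (0, 1) that decides
        # how much of the attended image feature replaces the LiDAR feature.
        gate = 1.0 / (1.0 + math.exp(-dot(q, attended) / scale))
        fused.append([gate * a + (1.0 - gate) * x for a, x in zip(attended, q)])
        all_weights.append(weights)
    return fused, all_weights


if __name__ == "__main__":
    lidar = [[1.0, 0.0], [0.0, 1.0]]               # two point features
    image = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three pixel features
    fused, weights = cross_attention_fuse(lidar, image)
    print(fused)
    print(weights)
```

Unlike plain concatenation, each point here receives an image feature weighted by its similarity to every pixel feature, and the gate lets the contribution of the image modality vary per point. A real detector would add learned query, key, and value projections and operate on batched tensors.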
Pages: 106-112
Number of pages: 7