DyFusion: Cross-Attention 3D Object Detection with Dynamic Fusion

Cited by: 10

Authors:

Bi, Jiangfeng [1]

Wei, Haiyue [1]

Zhang, Guoxin [1]

Yang, Kuihe [1]

Song, Ziying [2]

Affiliations:

[1] Hebei Univ Sci & Technol, Sch Informat Sci & Engn, Shijiazhuang 050018, Peoples R China

[2] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing Key Lab Traff Data Anal & Min, Beijing, Peoples R China

Source:

IEEE LATIN AMERICA TRANSACTIONS | 2024 / Vol. 22 / No. 02

Keywords:

cross-attention dynamic fusion; synchronous data augmentation; 3D object detection; CNN;

DOI:

10.1109/TLA.2024.10412035

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In autonomous driving, LiDAR and camera sensors are indispensable, providing the observational data needed for precise 3D object detection. Existing fusion algorithms exploit the complementary data from both sensors. However, these methods typically concatenate the raw point cloud data with pixel-level image features, a process that introduces errors and loses critical information embedded in each modality. To mitigate this loss of feature information, this paper proposes a Cross-Attention Dynamic Fusion (CADF) strategy that dynamically fuses the two heterogeneous data sources. In addition, to address insufficient data augmentation for these two modalities, we propose a Synchronous Data Augmentation (SDA) strategy designed to enhance training efficiency. We evaluated our method on the KITTI and nuScenes datasets. Our top-performing model attained 82.52% mAP on the KITTI test benchmark, outperforming other state-of-the-art methods.
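The abstract names cross-attention fusion but gives no implementation details. As a rough illustration of the general idea only, the sketch below lets LiDAR features act as queries over image features and blends the result back with a learned gate. This is not the paper's CADF module: the function names, the sigmoid gate, the weight shapes, and the random inputs are all assumptions made for the example, and plain Python lists are used so that it runs without dependencies.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(lidar, image, w_q, w_k, w_v, w_gate):
    """Hypothetical fusion: LiDAR rows (N, d) attend over image rows (M, d)."""
    q = matmul(lidar, w_q)                      # queries from LiDAR, (N, d)
    k = matmul(image, w_k)                      # keys from image, (M, d)
    v = matmul(image, w_v)                      # values from image, (M, d)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]        # transpose keys to (d, M)
    scores = matmul(q, k_t)                     # raw attention scores, (N, M)
    attn = [softmax([s / scale for s in row]) for row in scores]
    attended = matmul(attn, v)                  # image context per LiDAR row, (N, d)

    # Assumed "dynamic" part: a sigmoid gate computed from both inputs
    # decides, per channel, how much of each modality to keep.
    gate_in = [l_row + a_row for l_row, a_row in zip(lidar, attended)]  # (N, 2d)
    gate_logits = matmul(gate_in, w_gate)       # (N, d)
    fused = []
    for l_row, a_row, g_row in zip(lidar, attended, gate_logits):
        gates = [1.0 / (1.0 + math.exp(-g)) for g in g_row]
        fused.append([g * l + (1.0 - g) * a for g, l, a in zip(gates, l_row, a_row)])
    return fused, attn


# Toy usage with random features and weights (illustrative values only).
rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


N, M, d = 4, 6, 3
lidar = rand_matrix(N, d)
image = rand_matrix(M, d)
w_q, w_k, w_v = rand_matrix(d, d), rand_matrix(d, d), rand_matrix(d, d)
w_gate = rand_matrix(2 * d, d)

fused, attn = cross_attention_fuse(lidar, image, w_q, w_k, w_v, w_gate)
print(len(fused), len(fused[0]))  # one fused d-dimensional row per LiDAR feature
```

Each row of `attn` sums to 1, so every LiDAR feature receives a convex combination of image values. In contrast to plain concatenation, the weighting depends on the content of both inputs.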

Pages: 106-112

Page count: 7

