3D Object Detection Based on Voxel Self-Attention Auxiliary Networks

Times cited: 0
Authors
Cao, Jie [1]
Peng, Yiqiang [1, 2, 3]
Fan, Likang [1, 2, 3]
Wang, Longfei [1]
Affiliations
[1] Xihua Univ, Sch Automobile & Transportat, Chengdu 610039, Sichuan, Peoples R China
[2] Xihua Univ, Vehicle Measurement Control & Safety Key Lab Sichu, Chengdu 610039, Sichuan, Peoples R China
[3] Prov Engn Res Ctr New Energy Vehicle Intelligent C, Chengdu 610039, Sichuan, Peoples R China
Keywords
LiDAR; object detection; automatic drive; voxel; self-attention; VISION
DOI
10.3788/LOP240923
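Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809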
Abstract
A voxel self-attention auxiliary (VSAA) network is proposed to address the poor detection performance of LiDAR object detection algorithms in autonomous driving scenes. This weakness stems from the algorithms' reliance on convolutional neural networks (CNNs), which leaves them without a deep understanding of the spatial structure. The VSAA network can be applied directly to most voxel-based object detection algorithms to enhance their feature extraction capabilities. First, building on voxel feature encoding, the VSAA network constructs voxel hash tables as a secondary encoding, which makes the search for relevant voxels in the subsequent self-attention calculations more efficient. Second, the VSAA network applies the self-attention mechanism at the voxel level to capture global information and deep contextual semantic information. Finally, this study proposes the VA-SECOND and VA-PVRCNN algorithms by applying the VSAA network to the benchmark algorithms SECOND and PV-RCNN, respectively. The features of the VSAA network and the CNN are fused to compensate for the small receptive field of the CNN, which enhances the detection ability of the algorithm and allows it to understand the entire spatial scene. Experimental results on the KITTI dataset show that, compared with the benchmark algorithms, VA-SECOND and VA-PVRCNN improve the average detection accuracy over all detected targets by 1.16 and 1.54 percentage points, respectively, which demonstrates the effectiveness of the VSAA network.
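The two mechanisms named in the abstract, a voxel hash table for fast lookup and self-attention computed at the voxel level, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how relevant voxels are selected or how queries, keys, and values are produced, so the sketch assumes a fixed-radius neighborhood lookup and identity projections (no learned weights). All function names and parameters (`voxelize`, `build_hash_table`, `neighbors`, `attend`, `voxel_self_attention`, `radius`) are hypothetical.

```python
import math


def voxelize(points, voxel_size):
    """Group points by integer voxel coordinate; each voxel feature is the mean point."""
    buckets = {}
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        buckets.setdefault(key, []).append((x, y, z))
    coords = list(buckets)
    feats = [[sum(p[d] for p in pts) / len(pts) for d in range(3)]
             for pts in buckets.values()]
    return coords, feats


def build_hash_table(coords):
    """Map each occupied voxel coordinate to its row index for O(1) lookup."""
    return {coord: index for index, coord in enumerate(coords)}


def neighbors(coord, table, radius=1):
    """Return indices of occupied voxels within `radius` cells (assumed selection rule)."""
    cx, cy, cz = coord
    found = []
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            for dz in range(-radius, radius + 1):
                index = table.get((cx + dx, cy + dy, cz + dz))
                if index is not None:
                    found.append(index)
    return found


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))]


def voxel_self_attention(coords, feats, radius=1):
    """Attend from every voxel to the occupied voxels found through the hash table."""
    table = build_hash_table(coords)
    output = []
    for coord, feat in zip(coords, feats):
        context = [feats[i] for i in neighbors(coord, table, radius)]
        # Identity projections: the same features act as keys and values (assumption).
        output.append(attend(feat, context, context))
    return output


if __name__ == "__main__":
    points = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.1, 0.1, 0.1), (5.0, 5.0, 5.0)]
    coords, feats = voxelize(points, voxel_size=1.0)
    for coord, row in zip(coords, voxel_self_attention(coords, feats)):
        print(coord, row)
```

The hash table is what keeps the lookup cheap: LiDAR voxel grids are mostly empty, so querying occupied cells by coordinate avoids scanning a dense grid. In a detector, the attended features would then be fused with the CNN features, as the abstract describes.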
Pages: 10