Lightweight Semantic Segmentation Method Based on Local Window Cross Attention

Cited by: 0
Authors
Jin Z. [1]
Wei H. [1]
Zheng L. [1,2]
Lou L. [1]
Zheng G. [1]
Affiliations
[1] School of Electromechanical and Vehicle Engineering, Chongqing Jiaotong University, Chongqing
[2] University of British Columbia Okanagan, Kelowna, BC
Source
Keywords
BEV; cross-attention; local window; semantic segmentation
DOI
10.19562/j.chinasae.qcgc.2023.09.010
CLC number
Subject classification code
Abstract
For environmental perception in autonomous vehicles, using surround-view cameras for semantic segmentation of lanes, vehicles and other targets in Bird's Eye View (BEV) coordinates has attracted wide attention. Inference latency grows linearly with the number of cameras, which makes it difficult to complete semantic segmentation in real time for autonomous driving perception. To address these problems, this paper proposes a lightweight semantic segmentation method based on local window cross-attention. The model adopts an improved EdgeNeXt backbone network to extract features. Local window cross-attention between BEV queries and image features is constructed to query features across camera perspectives. The fused BEV feature map is then decoded by upsampling residual blocks to obtain the BEV semantic segmentation results. Experimental results on the nuScenes dataset show that the proposed method achieves 35.1% mIoU on the static lane segmentation task of the BEV map, which is 2.2% higher than that of HDMapNet. In particular, the inference speed is 58.2% higher than that of GKT, with the frame rate reaching 106 FPS. © 2023 SAE-China. All rights reserved.
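The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the general idea of local window cross-attention, not the authors' model. It assumes a single camera feature map, assumes each BEV query already has an integer reference location on that map (in a real system this would come from camera calibration, which is not shown), and omits learned projections and multi-camera fusion. The function name and all parameter names are hypothetical.

```python
import numpy as np

def local_window_cross_attention(queries, feats, ref_points, window=3):
    """Illustrative sketch only (hypothetical names, not the paper's code).

    queries:    (N, C) BEV query vectors
    feats:      (H, W, C) image feature map from one camera
    ref_points: (N, 2) integer (row, col) window centres, assumed given
    window:     odd side length of the local window
    """
    H, W, C = feats.shape
    r = window // 2
    # Zero-pad so that windows centred on border pixels keep a fixed size.
    padded = np.pad(feats, ((r, r), (r, r), (0, 0)))
    # Boolean mask that is True only at real (non-padded) positions.
    valid = np.pad(np.ones((H, W), dtype=bool), r)
    out = np.zeros_like(queries, dtype=float)
    for i, (row, col) in enumerate(ref_points):
        # Keys and values are the features inside the local window.
        keys = padded[row:row + window, col:col + window].reshape(-1, C)
        mask = valid[row:row + window, col:col + window].reshape(-1)
        # Scaled dot-product scores; padded positions are excluded.
        scores = keys @ queries[i] / np.sqrt(C)
        scores = np.where(mask, scores, -np.inf)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        # Each query attends to window * window keys, not all H * W.
        out[i] = weights @ keys
    return out
```

Because each query attends only to the window around its reference point, the cost per query depends on the window size and not on the full feature map size, which is the kind of saving a lightweight design would rely on.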
Pages: 1617-1625
Page count: 8