Autonomous driving semantic segmentation with convolution neural networks

Cited by: 0

Authors:

Wang Z.-Y. [1]

Ni X.-Y. [1]

Shang Z.-D. [2]

Affiliations:

[1] School of Instrumentation Science and Opto-Electronics Engineering, Beihang University, Beijing

[2] School of Mechatronics Engineering, Henan University of Science and Technology, Luoyang

Source:

Guangxue Jingmi Gongcheng/Optics and Precision Engineering | 2019 / Vol. 27 / No. 11

Keywords:

Autonomous driving; Convolutional neural networks; DeepLab v3+; Semantic image segmentation

DOI:

10.3788/OPE.20192711.2429

Abstract:

Semantic image segmentation is an essential part of modern autonomous driving systems, because an accurate understanding of the scene around the vehicle is key to navigation and motion planning. The existing state-of-the-art convolutional neural network-based semantic segmentation model DeepLab v3+ cannot exploit attention information, which leads to coarse segmentation boundaries. To improve semantic image segmentation accuracy in autonomous driving scenarios, this paper proposes a segmentation model that combines low-level pixel information with channel and spatial information. By inserting an attention module into the convolutional neural network, semantic-level image information can be extracted, and richer features can be obtained by learning the position and channel information of the image. The unary potential is computed from the per-category scores output by the convolutional neural network, and the pairwise potential is obtained from the preliminary segmentation and the original input image, so that every pixel of the image can be modeled by a fully connected conditional random field and the local details of the image can be refined. The final semantic segmentation result is obtained from the fully connected conditional random field through iteration. Compared with the existing DeepLab v3+ network, the improved model raises the key indicators mean intersection over union (mIoU) and mean pixel accuracy (mPA) by 1.07 and 3.34 percentage points, respectively. It segments objects more finely and better suppresses over-smoothed boundary regions and implausible isolated islands. © 2019, Science Press. All rights reserved.
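
The two indicators named in the abstract, mIoU and mPA, are commonly computed from a per-class confusion matrix. The following is a minimal Python sketch under that common definition, not the authors' evaluation code. The function names and the toy labels are hypothetical and serve only as illustration; classes that never occur are skipped, which is an assumption rather than something the abstract states.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """Average, over classes present in the ground truth, of correct / total pixels."""
    accs = []
    for i, row in enumerate(cm):
        total = sum(row)
        if total > 0:  # skip classes absent from the ground truth (assumption)
            accs.append(row[i] / total)
    return sum(accs) / len(accs) if accs else 0.0


def mean_iou(cm):
    """Average, over classes, of intersection / union between prediction and ground truth."""
    n = len(cm)
    ious = []
    for i in range(n):
        tp = cm[i][i]                           # pixels correctly labelled as class i
        gt = sum(cm[i])                         # all ground-truth pixels of class i
        pred = sum(cm[j][i] for j in range(n))  # all pixels predicted as class i
        union = gt + pred - tp
        if union > 0:  # skip classes absent from both maps (assumption)
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy data: six pixels, three classes, flattened label maps.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
mpa = mean_pixel_accuracy(cm)
miou = mean_iou(cm)
```

For the toy data, the per-class pixel accuracies are 1/2, 1 and 1/2, and the per-class IoU values are 1/3, 2/3 and 1/2. Because both metrics are class averages, a gain reported in percentage points, as in the abstract, is a difference between two such averages.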

References

Pages: 2429-2438

Page count: 9

17 references in total

[1] Shelhamer E., Long J., Darrell T., Fully convolutional networks for semantic segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 4, pp. 640-651, (2017)
[2] Pan X.Z., Zhang S.Q., Guo W.P., et al., Video-based facial expression recognition using multimodal deep convolutional neural networks, Opt. Precision Eng., 27, 4, pp. 963-970, (2019)
[3] Li Y., Liu X.Y., Zhang H.Q., et al., Optical remote sensing image retrieval based on convolutional neural networks, Opt. Precision Eng., 26, 1, pp. 200-207, (2018)
[4] Guo B.Q., Wang N., Pedestrian intruding railway clearance classification algorithm based on improved deep convolutional network, Opt. Precision Eng., 26, 12, pp. 3040-3050, (2018)
[5] Pohlen T., Hermans A., Mathias M., et al., Full-resolution residual networks for semantic segmentation in street scenes, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017)
[6] He K., Zhang X., Ren S., et al., Deep residual learning for image recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, (2016)
[7] Yang M., Yu K., Zhang C., et al., DenseASPP for Semantic Segmentation in Street Scenes, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3684-3692, (2018)
[8] Wang F., Jiang M., Qian C., et al., Residual attention network for image classification, 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 6450-6458, (2017)
[9] Hu J., Shen L., Sun G., Squeeze-and-excitation networks, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7132-7141, (2018)
[10] Chen L., Zhu Y., Papandreou G., et al., Encoder-decoder with atrous separable convolution for semantic image segmentation, European Conference on Computer Vision, pp. 833-851, (2018)
