Pedestrian segmentation and detection in multi-scene based on G-UNet

Cited by: 0
Authors
Chen X.-Y. [1 ]
Bei X.-Y. [1 ]
Yao Q. [1 ]
Jin X. [1 ]
Affiliations
[1] School of Electrical Engineering, Guangxi University, Nanning
Keywords
Computer application; Gaussian kernel; Pedestrian semantic segmentation; Soft connection; Target enhancement loss
DOI
10.13229/j.cnki.jdxbgxb20200912
Abstract
Current semantic segmentation methods can obtain the outlines of pedestrians, but when pedestrians occlude each other, information such as the number of pedestrians in the image and their heights and center positions cannot be obtained directly. To solve this problem, we propose the G-UNet model, in which a Gaussian elliptical density-kernel detection branch for the pedestrian region is added alongside the semantic segmentation backbone, so that the center position, height and width of each pedestrian are given by the maximum point of the kernel and by its vertical-axis and horizontal-axis scales, respectively. Since each density kernel has a unique maximum point, the detection problem under pedestrian occlusion is resolved. In addition, UNet rigidly concatenates bottom-layer and top-layer features in a spatially symmetric way, so that 50% of the fixed error is transmitted directly to the bottom layer; we therefore propose a trainable Soft Connection to learn the optimal error-distribution propagation. Finally, because the value of the traditional loss function is proportional to the labeled area of a pedestrian, small-scale pedestrians are easily missed; an Objective Enhanced Loss is proposed to improve the network's ability to detect small-scale pedestrians. Experimental results on a self-built pedestrian segmentation dataset show that the proposed method is effective and superior to other methods. © 2022, Jilin University Press. All rights reserved.
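The elliptical density-kernel branch described above can be illustrated with a short sketch. The Python snippet below is a hedged illustration rather than the authors' implementation: it renders the kind of Gaussian elliptical ground-truth map such a branch would regress, where each kernel's peak marks a pedestrian's center and the vertical and horizontal standard deviations are tied to the box height and width. The function name `elliptical_gaussian_map`, the `sigma_ratio` divisor, and the merge-by-maximum step are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the paper's released code): rendering an elliptical
# Gaussian density kernel per pedestrian box. The peak marks the center;
# the vertical/horizontal spreads encode box height/width.
import numpy as np

def elliptical_gaussian_map(shape, boxes, sigma_ratio=6.0):
    """Render one density map of size `shape` (H, W) from pedestrian boxes.

    boxes: iterable of (cx, cy, w, h) in pixels.
    Overlapping kernels are merged with an element-wise maximum so that
    every pedestrian keeps exactly one local maximum.
    """
    H, W = shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float32)
    density = np.zeros((H, W), dtype=np.float32)
    for cx, cy, w, h in boxes:
        sigma_x = max(w / sigma_ratio, 1.0)   # horizontal spread ~ box width (assumed scaling)
        sigma_y = max(h / sigma_ratio, 1.0)   # vertical spread ~ box height (assumed scaling)
        kernel = np.exp(-(((xs - cx) ** 2) / (2 * sigma_x ** 2)
                          + ((ys - cy) ** 2) / (2 * sigma_y ** 2)))
        density = np.maximum(density, kernel)  # peak value 1 at each center
    return density

# Example: two partially overlapping pedestrians still yield two maxima.
gt = elliptical_gaussian_map((128, 96), [(40, 70, 24, 80), (55, 72, 24, 78)])
```

Because every kernel keeps a value of 1 at its own center, two partially overlapping pedestrians still produce two separable local maxima, which is the property the abstract relies on to handle occlusion.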
Pages: 925-933
Number of pages: 8