Pedestrian detection based on developed YOLOv3 with ResNet34_D

Authors
Qian H.-M. [1]
Chen W. [1]
Ma Y.-L. [1]
Shi F. [1]
Xiang W.-B. [2]
Affiliations
[1] College of Energy and Electrical Engineering, Hohai University, Nanjing
[2] College of Automation, Nanjing University of Science and Technology, Nanjing
Source
Kongzhi yu Juece/Control and Decision | 2022, Vol. 37, No. 07
Keywords
Deep learning; DIoU; Pedestrian detection; ResNet34_D; YOLOv3
DOI
10.13195/j.kzyjc.2021.0136
Abstract
Pedestrian detection is one of the main tasks in autonomous driving. Existing deep neural networks lack the ability to detect small or medium-sized objects and occluded objects, yet pedestrian detection requires exactly this ability, since pedestrians in images acquired by vehicle-mounted cameras are typically small, medium-sized, or occluded. In this paper, an improved YOLOv3 model based on ResNet34_D is proposed for pedestrian detection. The contributions of the improved model are as follows. First, a modified residual network, ResNet34_D, is obtained by changing the structure of the convolutional block and is used as the backbone of YOLOv3, which reduces the model size and thereby the training difficulty. Second, an SPP layer and a DropBlock module are introduced after the feature maps of the three stages of ResNet34_D, which improves the detection accuracy for pedestrians of different sizes. Third, to further increase detection accuracy, multi-scale anchors are determined using $K$-means clustering. Finally, the DIoU loss function is used to improve the detection of occluded objects. Ablation experiments demonstrate that each of these techniques improves detection accuracy. Further experiments show that the proposed model reaches an AP$_{50}$ of 69.8% on the BDD100K-Person dataset at a detection speed of 130 FPS. Comparison with other existing methods shows that the proposed method has a lower false detection rate for small and occluded targets and runs faster; the proposed improved YOLOv3 model based on ResNet34_D is therefore valuable in practical applications. Copyright ©2022 Control and Decision.
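The abstract names the DIoU loss without stating its form. As a rough illustration only, the sketch below assumes the commonly used definition, loss = 1 - IoU + d^2 / c^2, where d is the distance between the two box centres and c is the diagonal of the smallest box enclosing both. The function name `diou_loss` and the (x1, y1, x2, y2) box format are assumptions made for this example and are not taken from the paper.

```python
def diou_loss(box_a, box_b):
    """Sketch of a DIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Assumed form: 1 - IoU + (centre distance)^2 / (enclosing diagonal)^2.
    Illustrative only; not the authors' implementation.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centres.
    dx = (ax1 + ax2) / 2 - (bx1 + bx2) / 2
    dy = (ay1 + ay2) / 2 - (by1 + by2) / 2
    center_dist_sq = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both inputs.
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    diag_sq = enc_w * enc_w + enc_h * enc_h

    penalty = center_dist_sq / diag_sq if diag_sq > 0 else 0.0
    return 1.0 - iou + penalty


if __name__ == "__main__":
    print(diou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
    print(diou_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
    print(diou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # no overlap
```

Under this definition the centre-distance term stays informative even when two boxes do not overlap at all, a case in which a plain IoU loss is constant. This is one plausible reason such a loss helps with crowded or occluded pedestrians, although the abstract itself does not spell out the mechanism.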
Pages: 1713-1720
Number of pages: 7