Encoding-Decoding Multi-Scale Convolutional Neural Network for Crowd Counting

Cited by: 3
Authors
Meng Y. [1 ]
Ji T. [1 ]
Liu G. [1 ]
Xu S. [1 ]
Li T. [1 ]
Affiliations
[1] School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an
Keywords
Atrous spatial pyramid pooling; Counting error; Crowd counting; Encoding-decoding; Loss function; Multi-scale
DOI
10.7652/xjtuxb202005020
Abstract
To address the multi-scale feature information loss, poor feature fusion, and low density-map quality of crowd counting methods based on multi-column convolutional neural networks, a new crowd counting method based on an encoding-decoding multi-scale convolutional neural network is proposed. The encoder uses multi-column convolution to capture multi-scale features and expands the receptive field while reducing computation via atrous spatial pyramid pooling (ASPP), retaining both the multi-scale features and the contextual information of the image. The decoder upsamples the encoder output to fuse features rich in high-level semantic information with features rich in low-level detail, improving the quality of the output density map. To make the network more sensitive to the count, a new loss function is proposed that combines the conventional pixel-space loss with the counting error. Comparative experiments against previous methods on the ShanghaiTech and Mall datasets and a self-built dataset show that the mean absolute error and mean squared error of this method are 8.3% and 21.3% lower than those of the previous best method on ShanghaiTech Part_A, and 12.9% and 12.0% lower on ShanghaiTech Part_B; they decrease by 15.1% and 23.8% on the Mall dataset and by 13.5% and 7.1% on the self-built dataset. These results demonstrate higher accuracy and better robustness of the proposed method compared with traditional methods. © 2020, Editorial Office of Journal of Xi'an Jiaotong University. All rights reserved.
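To make the architecture described in the abstract concrete, the following is a minimal PyTorch sketch, not the authors' implementation: an encoder whose parallel columns use different kernel sizes to capture multi-scale features and whose ASPP stage applies parallel dilated convolutions, plus a decoder that upsamples the high-level features and fuses them with the low-level detail features. All kernel sizes, channel widths, dilation rates, and the downsampling depth are assumptions.

# Illustrative sketch only: multi-column encoder with ASPP, and a decoder
# that fuses upsampled high-level semantics with low-level detail features.
# Layer sizes and dilation rates are assumed, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiColumnASPPEncoder(nn.Module):
    def __init__(self, in_ch=3, ch=32):
        super().__init__()
        # Parallel columns with different kernel sizes capture multi-scale features.
        self.columns = nn.ModuleList([
            nn.Sequential(nn.Conv2d(in_ch, ch, k, padding=k // 2),
                          nn.ReLU(inplace=True))
            for k in (3, 5, 7)
        ])
        self.pool = nn.MaxPool2d(2)
        # ASPP: parallel dilated 3x3 convolutions enlarge the receptive field
        # without the parameter cost of correspondingly large kernels.
        self.aspp = nn.ModuleList([
            nn.Conv2d(3 * ch, ch, 3, padding=d, dilation=d)
            for d in (1, 2, 4, 8)
        ])

    def forward(self, x):
        low = torch.cat([col(x) for col in self.columns], dim=1)  # detail features
        h = self.pool(self.pool(low))                             # 1/4 resolution
        high = torch.cat([F.relu(branch(h)) for branch in self.aspp], dim=1)
        return low, high

class FusionDecoder(nn.Module):
    def __init__(self, low_ch=96, high_ch=128):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(low_ch + high_ch, 64, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1),  # single-channel density map
        )

    def forward(self, low, high):
        # Upsample high-level semantics back to the detail-feature resolution,
        # then fuse the two streams to predict the density map.
        high = F.interpolate(high, size=low.shape[2:], mode='bilinear',
                             align_corners=False)
        return self.fuse(torch.cat([low, high], dim=1))

class CrowdCounter(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = MultiColumnASPPEncoder()
        self.decoder = FusionDecoder()

    def forward(self, x):
        return self.decoder(*self.encoder(x))

The sketch illustrates the two design points the abstract names: dilated convolutions grow the receptive field cheaply, and fusing upsampled semantic features with full-resolution detail features is what allows the decoder to produce a sharper density map.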
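The loss described in the abstract couples the usual pixel-space term with a counting-error term. One plausible formulation, with the weight alpha and the exact form of each term assumed rather than taken from the paper:

import torch

def density_counting_loss(pred, gt, alpha=0.1):
    """pred, gt: (B, 1, H, W) density maps; alpha is an assumed weight."""
    # Pixel-space term: per-pixel squared error between density maps.
    pixel_loss = torch.mean((pred - gt) ** 2)
    # Counting term: the crowd count is the sum of the density map, so we
    # penalize the per-image absolute difference of the summed maps.
    count_loss = torch.mean(torch.abs(pred.sum(dim=(1, 2, 3))
                                      - gt.sum(dim=(1, 2, 3))))
    return pixel_loss + alpha * count_loss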
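The reported MAE and MSE are the standard crowd-counting metrics computed over per-image counts (in this literature, "MSE" conventionally denotes the root of the mean squared error). A small sketch of both:

import math

def mae_mse(pred_counts, gt_counts):
    """Standard crowd-counting metrics over per-image counts."""
    n = len(pred_counts)
    mae = sum(abs(p - g) for p, g in zip(pred_counts, gt_counts)) / n
    # "MSE" here follows the crowd-counting convention: the root of the
    # mean squared error of the counts.
    mse = math.sqrt(sum((p - g) ** 2 for p, g in zip(pred_counts, gt_counts)) / n)
    return mae, mse

# Example: mae_mse([102.0, 87.0], [100.0, 90.0]) returns (2.5, ~2.55).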
Pages: 149-157
Page count: 8