Fine-Grained Car Recognition Model Based on Semantic DCNN Features Fusion

Cited by: 0
Authors
Yang J. [1 ]
Cao H. [1 ]
Wang R. [1 ]
Xue L. [1 ]
Affiliations
[1] School of Computer and Information, Hefei University of Technology, Hefei
Source
Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | 2019, Vol. 31, No. 01
Keywords
Car recognition; Convolutional neural networks; Deep learning; Fine-grained car recognition; Fine-grained recognition; Image classification
DOI
10.3724/SP.J.1089.2019.17130
Abstract
Deep convolutional neural networks (DCNN) lack the ability to represent semantic information, while in fine-grained categorization the visual differences between classes are small and concentrated on key semantic parts. To address this, this paper proposes a model that fuses the semantic information of DCNN features, composed of a detection sub-network and a classification sub-network. First, the detection sub-network locates the car object and each of its semantic parts using Faster RCNN. Second, the classification sub-network extracts features of the whole car object and of its semantic parts via a DCNN, then joins and fuses these features with small-kernel convolutions. Finally, the recognition result is obtained through a deep neural network. The model achieves a recognition accuracy of 78.74% on the Stanford BMW-10 dataset, which is 13.39% higher than the VGG network method, and 85.94% on the Stanford cars-197 dataset. With transfer learning, the model reaches 98.27% on the BMVC car-types dataset, 3.77% higher than the best previously reported result on that dataset. Experimental results show that the model avoids the dependence of fine-grained car recognition on the positions of the car object and its semantic parts, and offers high recognition accuracy and good versatility. © 2019, Beijing China Science Journal Publishing Co. Ltd. All rights reserved.
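The join-and-fuse step of the classification sub-network can be sketched in code. The following is a minimal, hypothetical PyTorch illustration, not the paper's actual implementation: the ResNet-18 backbone, the number of semantic parts (four), the 1x1 convolution used as the small-kernel fusion, and all layer sizes are assumptions, and the detection sub-network (Faster RCNN) is assumed to have already produced the whole-car and part crops.

# Minimal sketch of the classification sub-network described above.
# Backbone, part count, and layer dimensions are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

class FusionClassifier(nn.Module):
    def __init__(self, num_classes=10, num_parts=4, feat_dim=512):
        super().__init__()
        # Shared DCNN feature extractor applied to the whole-car crop and
        # to each semantic-part crop from the detection sub-network.
        backbone = models.resnet18()
        self.extractor = nn.Sequential(*list(backbone.children())[:-1])  # -> (B, 512, 1, 1)
        # Small-kernel (1x1) convolution fusing the stacked region features.
        self.fuse = nn.Conv2d(num_parts + 1, 1, kernel_size=1)
        self.classifier = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(inplace=True),
            nn.Linear(256, num_classes),
        )

    def forward(self, whole_car, part_crops):
        # whole_car: (B, 3, H, W); part_crops: list of num_parts tensors of shape (B, 3, H, W)
        regions = [whole_car] + part_crops
        feats = [self.extractor(r).flatten(1) for r in regions]   # each (B, feat_dim)
        stacked = torch.stack(feats, dim=1).unsqueeze(-1)         # (B, num_parts+1, feat_dim, 1)
        fused = self.fuse(stacked).squeeze(-1).squeeze(1)         # (B, feat_dim)
        return self.classifier(fused)

if __name__ == "__main__":
    model = FusionClassifier(num_classes=10, num_parts=4)
    car = torch.randn(2, 3, 224, 224)
    parts = [torch.randn(2, 3, 224, 224) for _ in range(4)]
    print(model(car, parts).shape)  # torch.Size([2, 10])

In this sketch the 1x1 convolution mixes the region (whole-car plus part) feature vectors channel-wise before classification; the paper's actual fusion layout may differ.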
Pages: 141-157
Number of pages: 16