Kernelized Bilinear CNN Models for Fine-Grained Visual Recognition

Cited by: 0

Authors:

Ge S.-Y. [1]

Gao Z.-L. [1]

Zhang B.-B. [1]

Li P.-H. [1]

Affiliation:

[1] School of Information and Communication Engineering, Dalian University of Technology, Dalian, 116024, Liaoning

Source:

Tien Tzu Hsueh Pao/Acta Electronica Sinica | 2019, Vol. 47, No. 10

Keywords:

Bilinear convolutional neural network; End-to-end learning; Fine-grained visual recognition; Kernelized bilinear pooling

DOI:

10.3969/j.issn.0372-2112.2019.10.015

Abstract:

The bilinear convolutional neural network (B-CNN) is widely used in computer vision. By taking the outer product of the features output by the convolutional layers, B-CNN captures the linear correlation between channels and thereby strengthens the representational ability of the network. However, it ignores the non-linear relationships between channels in the feature map, so it cannot fully exploit the richer inter-channel information. To address this problem, this paper proposes a kernelized bilinear convolutional neural network that uses a kernel function to capture the non-linear relationships between channels in the feature map, further strengthening the representational ability of the network. The method is evaluated on three common fine-grained benchmarks: CUB-200-2011, FGVC-Aircraft and Cars. Experiments show that it outperforms its counterparts on all three benchmarks. © 2019, Chinese Institute of Electronics. All rights reserved.
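The contrast the abstract draws between linear and kernel-based channel statistics can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the kernel, so the Gaussian (RBF) kernel, the `gamma` parameter, and all function names below are assumptions chosen only for illustration. A feature map is represented as a list of N local descriptors, each with C channel values.

```python
import math

def channel_vectors(features):
    """Transpose N x C local descriptors into C channel vectors of length N."""
    return [list(col) for col in zip(*features)]

def bilinear_pool(features):
    """Averaged outer product: B[i][j] = (1/N) * sum_n x_n[i] * x_n[j].

    Each entry is a scaled inner product between two channel vectors,
    so it only measures linear correlation between channels.
    """
    n = len(features)
    chans = channel_vectors(features)
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in chans]
            for ci in chans]

def rbf_kernel_pool(features, gamma=1.0):
    """Hypothetical kernelized variant (assumed RBF kernel, not from the paper):
    K[i][j] = exp(-gamma * ||c_i - c_j||^2) for channel vectors c_i, c_j.
    """
    chans = channel_vectors(features)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(ci, cj)))
             for cj in chans]
            for ci in chans]

# Toy feature map: 3 spatial locations, 2 channels.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
B = bilinear_pool(features)               # 2 x 2 linear channel statistics
K = rbf_kernel_pool(features, gamma=0.5)  # 2 x 2 kernel channel statistics
```

Both functions return a C x C matrix, so under this assumption the kernelized version could replace the outer-product step without changing the shape of the representation passed to later layers.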

Pages: 2134-2141

Number of pages: 7

