Fine-Grained Image Classification Algorithm Using Multi-Scale Feature Fusion and Re-Attention Mechanism

Cited by: 1

Authors:

He K. [1]

Feng X. [1]

Gao S. [1]

Ma X. [1]

Affiliations:

[1] School of Electrical and Information Engineering, Tianjin University, Tianjin

Source:

Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/Journal of Tianjin University Science and Technology | 2020, Vol. 53, No. 10

Funding:

National Natural Science Foundation of China

Keywords:

Fine-grained image classification; Multi-scale feature fusion; Re-attention mechanism; ResNet50;

DOI:

10.11784/tdxbz201910029

Abstract:

Fine-grained image classification aims to precisely distinguish the subclasses within a given category. Because subclasses share similar features while differing in pose and suffering from background interference, the task has become a popular and challenging topic in computer vision and pattern recognition, with important research value. The key issue in fine-grained image classification is how to extract precise features from the discriminative regions of an image. Existing neural-network-based algorithms still fall short in extracting fine features. Accordingly, this study proposes a fine-grained image classification algorithm that uses a multi-scale re-attention mechanism. Because high-level features carry rich semantic information and low-level features carry rich texture information, the attention mechanism is embedded at different scales to obtain rich feature information. In addition, each input feature map is processed with both channel and spatial attention, which can be regarded as re-attention over the feature matrix. Finally, the attention results are combined with the original input feature maps in residual form, and the attention results on the feature maps of different scales are concatenated and fed into the fully connected layer, which helps extract salient features accurately. Accuracy rates of 86.16%, 92.26%, and 93.40% are obtained on the international public fine-grained datasets CUB-200-2011, FGVC Aircraft, and Stanford Cars, respectively. Compared with ResNet50, the accuracy rate increases by 1.66%, 1.46%, and 1.10%, respectively. These results are clearly higher than those of existing classical algorithms and human performance, which demonstrates the effectiveness of the proposed algorithm. © 2020, Editorial Board of Journal of Tianjin University (Science and Technology). All rights reserved.
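
The pipeline described in the abstract (channel and spatial attention on each feature map, a residual combination with the input, and concatenation across scales before the fully connected layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The parameter-free attention weights, the channel-then-spatial order, the global average pooling before concatenation, the feature-map shapes, and the random classifier weights are all assumptions made for illustration; the paper builds on ResNet50 and presumably uses learned attention layers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """feat has shape (C, H, W). Scale each channel by a sigmoid of its
    global average (assumed parameter-free stand-in for learned weights)."""
    weights = sigmoid(feat.mean(axis=(1, 2)))      # shape (C,)
    return feat * weights[:, None, None]

def spatial_attention(feat):
    """Scale each spatial position by a sigmoid of its mean over channels
    (again an assumed stand-in for a learned spatial attention map)."""
    weights = sigmoid(feat.mean(axis=0))           # shape (H, W)
    return feat * weights[None, :, :]

def re_attention(feat):
    """Apply channel then spatial attention (assumed order) and add the
    result to the original input in residual form."""
    attended = spatial_attention(channel_attention(feat))
    return feat + attended

def fuse_scales(feature_maps):
    """Run re-attention on each scale, global-average-pool each result to a
    vector (assumption, since spatial sizes differ), and concatenate."""
    pooled = [re_attention(f).mean(axis=(1, 2)) for f in feature_maps]
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
# Hypothetical low-, mid-, and high-level feature maps, each (C, H, W).
low = rng.standard_normal((8, 16, 16))
mid = rng.standard_normal((16, 8, 8))
high = rng.standard_normal((32, 4, 4))

fused = fuse_scales([low, mid, high])              # shape (8 + 16 + 32,)

# Stand-in fully connected layer with random weights; 200 outputs match the
# 200 classes of CUB-200-2011.
fc_weights = rng.standard_normal((200, fused.shape[0]))
logits = fc_weights @ fused
print(fused.shape, logits.shape)
```

The residual form keeps the original features available to later layers even where the attention weights are small, which is one common motivation for combining attention output with its input in this way.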

Citation

Pages: 1077-1085

Page count: 8
