A weakly supervised spatial group attention network for fine-grained visual recognition

Cited by: 8

Authors:

Xie, Jiangjian ^{[1,2,3]}

Zhong, Yujie ^{[1]}

Zhang, Junguo ^{[1,2]}

Zhang, Changchun ^{[1,2]}

Schuller, Bjoern W. ^{[3,4,5]}

Affiliations:

[1] Beijing Forestry Univ, Sch Technol, Beijing 100083, Peoples R China

[2] Beijing Forestry Univ, Res Ctr Biodivers Intelligent Monitoring, Beijing 100083, Peoples R China

[3] Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, D-86159 Augsburg, Germany

[4] Imperial Coll London, GLAM Grp Language Audio & Mus, London SW7 2AZ, England

[5] Univ Augsburg, Ctr Interdisciplinary Hlth Res, D-86159 Augsburg, Germany

Source:

APPLIED INTELLIGENCE | 2023 / Vol. 53 / Issue 20

Keywords:

Classification; Fine-grained image; Bird recognition; Weakly supervised network; Moment exchange; Spatial group attention;

DOI:

10.1007/s10489-023-04627-z

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Fine-grained visual recognition aims to classify several sub-categories belonging to the same basic-level category. The task is highly challenging because variance within a single sub-category is large while variance between different sub-categories is small. Previous approaches generally localize the targets or parts first, then determine which sub-category the image belongs to. They depend on target or part annotations, which are labor-intensive to produce and a barrier to practical use. Other methods indirectly extract recognizable areas from the high-level feature maps, ignoring the spatial relationships between the target and its parts, which may cause inaccurate recognition. In this paper, we propose a weakly supervised spatial group attention network (WSSGA-Net) for fine-grained bird recognition. Based on the spatial relationships between the target and its parts, we embed the spatial group attention (SGA) module into the WSSGA-Net to highlight the correct semantic feature regions by establishing a semantic feature space enhancement mechanism. In addition, we apply moment exchange (MoEx) for data augmentation, generating new feature maps by exchanging the feature moments of two input images. Comprehensive experiments indicate that our approach performs significantly better than state-of-the-art approaches on the standard bird image datasets Bird-65 and CUB200-2011 and on the fine-grained dataset Stanford Cars.
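
The moment-exchange idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say over which axes the moments are computed, so the sketch assumes the simplest case, where the moments are the mean and standard deviation of one flattened feature vector. The names `moments`, `moment_exchange`, and `eps` are hypothetical and chosen only for illustration.

```python
import math


def moments(features):
    """Return (mean, std) of a flat list of feature values (assumed moment definition)."""
    n = len(features)
    mean = sum(features) / n
    var = sum((x - mean) ** 2 for x in features) / n
    return mean, math.sqrt(var)


def moment_exchange(feat_a, feat_b, eps=1e-5):
    """Normalize feat_a with its own moments, then re-scale it with feat_b's moments.

    The result keeps the relative pattern of feat_a but carries the
    mean and standard deviation of feat_b.
    """
    mean_a, std_a = moments(feat_a)
    mean_b, std_b = moments(feat_b)
    return [(x - mean_a) / (std_a + eps) * std_b + mean_b for x in feat_a]


# Toy feature vectors standing in for the features of two input images.
feat_a = [1.0, 2.0, 3.0, 4.0]
feat_b = [10.0, 20.0, 30.0, 40.0]
mixed = moment_exchange(feat_a, feat_b)
```

Here `mixed` preserves the ordering of values in `feat_a` while its mean and standard deviation match those of `feat_b` up to the small `eps` term. In a real network the same operation would be applied to intermediate feature maps rather than toy lists.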

Pages: 23301 - 23315

Page count: 15

50 records in total

[41] The Image Data and Backbone in Weakly Supervised Fine-Grained Visual Categorization: A Revisit and Further Thinking
Ye, Shuo
Wang, Yu
Peng, Qinmu
You, Xinge
Chen, C. L. Philip
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (01) : 2 - 16
[42] Weakly Supervised Learning of Part Selection Model with Spatial Constraints for Fine-Grained Image Classification
He, Xiangteng
Peng, Yuxin
THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4075 - 4081
[43] Fine-Grained Crowdsourcing for Fine-Grained Recognition
Jia Deng
Krause, Jonathan
Li Fei-Fei
2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, : 580 - 587
[44] Weakly supervised fine-grained semantic segmentation via spatial correlation-guided learning
Dong, Zihao
Fang, Tiyu
Li, Jinping
Shao, Xiuli
COMPUTER VISION AND IMAGE UNDERSTANDING, 2023, 236
[45] Annotation modification for fine-grained visual recognition
Luo, Changzhi
Meng, Zhijun
Feng, Jiashi
Ni, Bingbing
Wang, Meng
NEUROCOMPUTING, 2018, 274 : 58 - 65
[46] Fine-Grained Radio Frequency Fingerprint Recognition Network Based on Attention Mechanism
Zhang, Yulan
Hu, Jun
Jiang, Rundong
Lin, Zengrong
Chen, Zengping
ENTROPY, 2024, 26 (01)
[47] Weakly supervised fine-grained image classification via two-level attention activation model
Ke, Xiao
Huang, Yanyan
Guo, WenZhong
COMPUTER VISION AND IMAGE UNDERSTANDING, 2022, 218
[48] Fine-grained vehicle type detection and recognition based on dense attention network
Ke, Xiao
Zhang, Yufeng
NEUROCOMPUTING, 2020, 399 : 247 - 257
[49] Multiple Recurrent Attention Convolutional Neural Network For fine-grained image recognition
Zhu, Xiaotong
Bian, Hengwei
2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022, : 44 - 48
[50] Fine-Grained Visual Classification Network Based on Fusion Pooling and Attention Enhancement
Xiao B.
Guo J.
Zhang X.
Wang M.
Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2023, 36 (07) : 661 - 670
