A weakly supervised spatial group attention network for fine-grained visual recognition

Cited by: 8
Authors
Xie, Jiangjian [1, 2, 3]
Zhong, Yujie [1]
Zhang, Junguo [1, 2]
Zhang, Changchun [1, 2]
Schuller, Bjoern W. [3, 4, 5]
Affiliations
[1] Beijing Forestry Univ, Sch Technol, Beijing 100083, Peoples R China
[2] Beijing Forestry Univ, Res Ctr Biodivers Intelligent Monitoring, Beijing 100083, Peoples R China
[3] Univ Augsburg, Chair Embedded Intelligence Hlth Care & Wellbeing, D-86159 Augsburg, Germany
[4] Imperial Coll London, GLAM Grp Language Audio & Mus, London SW7 2AZ, England
[5] Univ Augsburg, Ctr Interdisciplinary Hlth Res, D-86159 Augsburg, Germany
Keywords
Classification; Fine-grained image; Bird recognition; Weakly supervised network; Moment exchange; Spatial group attention;
DOI
10.1007/s10489-023-04627-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fine-grained visual recognition aims to classify sub-categories that belong to the same basic-level category. It is highly challenging because samples within one sub-category can vary widely, while different sub-categories may differ only slightly. Previous approaches generally localize the targets or parts first, then determine which sub-category the image belongs to. They depend on target or part annotations, which are labor-intensive to produce and hinder practical use. Other methods indirectly extract discriminative regions from high-level feature maps but ignore the spatial relationships between the target and its parts, which may cause inaccurate recognition. In this paper, we propose a weakly supervised spatial group attention network (WSSGA-Net) for fine-grained bird recognition. Exploiting the spatial relationships between the target and its parts, we embed a spatial group attention (SGA) module into WSSGA-Net to highlight the correct semantic feature regions by establishing a semantic feature space enhancement mechanism. In addition, we apply moment exchange (MoEx) for data augmentation, generating new feature maps by exchanging the feature moments of two input images. Comprehensive experiments indicate that our approach significantly outperforms state-of-the-art approaches on the standard bird image datasets Bird-65 and CUB200-2011 and on the fine-grained dataset Stanford Cars.
Pages: 23301-23315
Page count: 15