Fusing Attention Features and Contextual Information for Scene Recognition

Cited by: 1
Authors
Peng, Yuqing [1]
Liu, Xianzi [2]
Wang, Chenxi [1]
Xiao, Tengfei [1]
Li, Tiejun [3]
Affiliations
[1] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin 300401, Peoples R China
[2] China Shenhua Int Engn Co Ltd, Beijing 100007, Peoples R China
[3] Hebei Univ Technol, Sch Mech Engn, Tianjin 300401, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Scene recognition; multi-scale attention; joint supervision; context information; CLASSIFICATION
DOI
10.1142/S0218001422500148
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To obtain more discriminative features from scene images and to overcome the effects of intra-class differences and inter-class similarities, this paper proposes a scene recognition method that combines attention and context information. First, we introduce an attention mechanism and build a multi-scale attention model, which extracts discriminative information about salient objects and regions by means of channel attention and spatial attention. In addition, a joint supervision strategy based on the center loss function is introduced to further reduce misclassification caused by intra-class differences. Second, a model based on multi-level context information is proposed to describe the positional relationships between objects, which alleviates the influence of object similarity between classes. Finally, the two models are fused so that their features complement each other: the final feature representation not only focuses on the effective discriminative information but also reflects the relative positions of salient objects. Extensive experiments show that the method addresses the problem of insufficient feature representation in scene recognition tasks and improves recognition accuracy.
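The abstract does not give the loss formula, so the following is only a minimal sketch of what joint supervision with a center-type loss commonly looks like: softmax cross-entropy plus a weighted penalty on the distance between each feature and its class center. The function names, the weight `lam`, and the toy values are all illustrative assumptions and are not taken from the paper.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def joint_loss(features, logits, labels, centers, lam=0.5):
    """Batch mean of cross-entropy + lam * center loss (lam is an assumed weight)."""
    total = 0.0
    for x, z, y in zip(features, logits, labels):
        total += softmax_cross_entropy(z, y) + lam * center_loss(x, centers[y])
    return total / len(labels)

# Toy batch: two samples, two classes, 2-D features (hypothetical values).
features = [[1.0, 0.0], [0.0, 1.0]]
logits = [[2.0, 0.0], [0.0, 2.0]]
labels = [0, 1]
centers = {0: [1.0, 0.0], 1: [0.0, 0.0]}

loss = joint_loss(features, logits, labels, centers, lam=0.5)
print(round(loss, 4))
```

In this sketch the second sample lies away from its class center, so the center term adds to the loss even though both samples are classified correctly; this is the sense in which such a term can pull features of the same class together.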
Pages: 21
Related papers
50 in total
  • [21] Graph attention mechanism with global contextual information for multi-label image recognition
    Ban, Xiaoxiao
    Li, Peihua
    Wang, Qilong
    Zhou, Shoujun
    Guo, Shijie
    Wang, Yuanquan
    JOURNAL OF ELECTRONIC IMAGING, 2021, 30 (06)
  • [22] Facial Expression Recognition: One Attention-Modulated Contextual Spatial Information Network
    Li, Xue
    Zhu, Chunhua
    Zhou, Fei
    ENTROPY, 2022, 24 (07)
  • [23] Fusing Facial Texture Features for Face Recognition
    Yanqing Shao
    Chaowei Tang
    Min Xiao
    Hui Tang
    Proceedings of the National Academy of Sciences, India Section A: Physical Sciences, 2016, 86 : 395 - 403
  • [24] Selective attention to contextual information in Japan
    Ishii, K
    Kitayama, S
    PROCEEDINGS OF THE TWENTY-FIFTH ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY, PTS 1 AND 2, 2003, : 1358 - 1358
  • [25] Violence Detection Through Fusing Visual Information to Auditory Scene
    Li, Hongwei
    Ma, Lin
    Min, Xinyu
    Li, Haifeng
    MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2022, 2023, 1765 : 208 - 220
  • [26] Deep Contextual Stroke Pooling for Scene Character Recognition
    Zhang, Zhong
    Wang, Hong
    Liu, Shuang
    Xiao, Baihua
    IEEE ACCESS, 2018, 6 : 16454 - 16463
  • [27] CRABR-Net: A Contextual Relational Attention-Based Recognition Network for Remote Sensing Scene Objective
    Guo, Ningbo
    Jiang, Mingyong
    Gao, Lijing
    Tang, Yizhuo
    Han, Jinwei
    Chen, Xiangning
    SENSORS, 2023, 23 (17)
  • [28] Fusing binaural sonar information for object recognition
    Kuc, R
    MF '96 - 1996 IEEE/SICE/RSJ INTERNATIONAL CONFERENCE ON MULTISENSOR FUSION AND INTEGRATION FOR INTELLIGENT SYSTEMS, 1996, : 727 - 735
  • [29] Intrinsic and contextual features in object recognition
    Schlangen, Derrick
    Barenholtz, Elan
    JOURNAL OF VISION, 2015, 15 (01)
  • [30] Speech Emotion Recognition Model Based on Attention CNN Bi-GRU Fusing Visual Information
    Hu, Zhangfang
    Wang, Lan
    Luo, Yuan
    Xia, Yanling
    Xiao, Hang
    ENGINEERING LETTERS, 2022, 30 (02)