Fusing Attention Features and Contextual Information for Scene Recognition

Cited by: 1

Authors:

Peng, Yuqing [1]

Liu, Xianzi [2]

Wang, Chenxi [1]

Xiao, Tengfei [1]

Li, Tiejun [3]

Affiliations:

[1] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin 300401, Peoples R China

[2] China Shenhua Int Engn Co Ltd, Beijing 100007, Peoples R China

[3] Hebei Univ Technol, Sch Mech Engn, Tianjin 300401, Peoples R China

Source:

INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE | 2022, Vol. 36, No. 03

Funding:

National Natural Science Foundation of China

Keywords:

Scene recognition; multi-scale attention; joint supervision; context information; CLASSIFICATION

DOI:

10.1142/S0218001422500148

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

To obtain more discriminative features from scene images and to overcome the effects of intra-class differences and inter-class similarities, the paper proposes a scene recognition method that combines attention and context information. First, we introduce the attention mechanism and build a multi-scale attention model. Channel attention and spatial attention direct the model toward salient objects and regions, which carry the discriminative information. In addition, a joint supervision strategy with a center loss function is introduced to further reduce misjudgments caused by intra-class differences. Second, a model based on multi-level context information is proposed to describe the positional relationships between objects, which effectively alleviates the influence of similar objects appearing in different classes. Finally, the two models are merged so that their features complement each other: the final feature representation not only focuses on the effective discriminative information but also reflects the relative positions of salient objects. Extensive experiments show that the proposed method effectively addresses the problem of insufficient feature representation in scene recognition tasks and improves the accuracy of scene recognition.
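The abstract mentions joint supervision with a center loss but gives no formula. The following is a minimal sketch of the commonly used form, in which a softmax cross-entropy term is added to a weighted distance between a feature and its class center. It is not taken from the paper: the function names, the weight `lam`, and all numbers are assumptions chosen only for illustration.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, computed with the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def center_loss(feature, label, centers):
    """Half the squared Euclidean distance between a feature and its class center."""
    center = centers[label]
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(logits, feature, label, centers, lam=0.01):
    """Softmax loss plus lam times center loss (lam is a hypothetical weight)."""
    return softmax_cross_entropy(logits, label) + lam * center_loss(feature, label, centers)


# Toy data: two classes with 2-D features (all values are made up).
centers = [[0.0, 0.0], [2.0, 2.0]]
near = joint_loss([2.0, 0.0], [0.1, -0.1], 0, centers, lam=0.5)
far = joint_loss([2.0, 0.0], [1.5, 1.5], 0, centers, lam=0.5)
```

With identical logits, the sample whose feature lies farther from its class center (`far`) receives the larger loss, which is how such a term penalizes intra-class spread. In a trained model the centers would normally be learned or updated alongside the network rather than fixed as they are here.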


Pages: 21

