Residual attention-based multi-scale script identification in scene text images

Authors
Ma M. [1]
Wang Q.-F. [1]
Huang S. [2]
Huang S. [2]
Goulermas Y. [3]
Huang K. [1]
Affiliations
[1] Department of Intelligent Science, School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou
[2] Tencent Technology Co. Ltd, Beijing
[3] Department of Computer Science, University of Liverpool, Liverpool
Funding
National Natural Science Foundation of China;
Keywords
Attention mechanism; Feature fusion; Global max pooling; Multi-scale features; Script identification;
DOI
10.1016/j.neucom.2020.09.015
Abstract
Script identification is an essential step in the text extraction pipeline for multilingual applications. This paper presents an effective approach to identifying scripts in scene text images. Because of complicated backgrounds, varied text styles, and the similarity of characters across languages, script identification remains an unsolved problem. Within the general classification framework for script identification, we investigate two important components: feature extraction and the classification layer. For feature extraction, we use a hierarchical feature fusion block to extract multi-scale features, and we adopt an attention mechanism to capture the locally discriminative parts of the feature maps. In the classification layer, we use a fully convolutional classifier to generate channel-level classifications, which are then processed by a global pooling layer to improve classification efficiency. We evaluated the proposed approach on the benchmark datasets RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, and the experimental results show the effectiveness of each elaborately designed component. The approach outperforms competitive models, with correct classification rates of 89.66%, 96.11%, 98.78% and 97.20% on RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, respectively. © 2020 Elsevier B.V.
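The classification layer described in the abstract (a fully convolutional classifier followed by global pooling) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1x1 convolution that turns C feature channels into one score map per script class, followed by global max pooling (the pooling type is taken from the keywords). The function names `conv1x1`, `global_max_pool` and `classify`, and all weights, are hypothetical.

```python
from typing import List, Tuple

# A feature map is indexed as [channel][row][column].
FeatureMap = List[List[List[float]]]


def conv1x1(features: FeatureMap,
            weights: List[List[float]],
            bias: List[float]) -> FeatureMap:
    """Apply a 1x1 convolution: one score map per class (hypothetical helper)."""
    height, width = len(features[0]), len(features[0][0])
    score_maps = []
    for class_weights, class_bias in zip(weights, bias):
        # Each output pixel is a weighted sum over input channels at that pixel.
        score_map = [
            [
                class_bias + sum(w * features[c][y][x]
                                 for c, w in enumerate(class_weights))
                for x in range(width)
            ]
            for y in range(height)
        ]
        score_maps.append(score_map)
    return score_maps


def global_max_pool(score_maps: FeatureMap) -> List[float]:
    """Reduce each class score map to its single largest response."""
    return [max(max(row) for row in score_map) for score_map in score_maps]


def classify(features: FeatureMap,
             weights: List[List[float]],
             bias: List[float]) -> Tuple[int, List[float]]:
    """Return the index of the highest-scoring class and all class scores."""
    scores = global_max_pool(conv1x1(features, weights, bias))
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Two feature channels over a 1x3 map, three made-up script classes.
    features = [[[0.1, 0.9, 0.2]], [[0.8, 0.1, 0.3]]]
    weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    bias = [0.0, 0.0, 0.0]
    print(classify(features, weights, bias))
```

Because the classifier has no fully connected layer, the same weights apply to feature maps of any width, which suits text-line images of varying length; max pooling lets the single most discriminative location decide each class score.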
Pages: 222 - 233
Number of pages: 11
Related papers
50 in total
  • [1] Residual attention-based multi-scale script identification in scene text images
    Ma, Mengkai
    Wang, Qiu-Feng
    Huang, Shan
    Huang, Shen
    Goulermas, Yannis
    Huang, Kaizhu
    NEUROCOMPUTING, 2021, 421 : 222 - 233
  • [2] Scene Text Removal Based on Multi-scale Attention Mechanism
    He, Ping
    Zhang, Heng
    Liu, Chenglin
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2022, 35 (07) : 614 - 624
  • [3] Multi-scale Attention-Based Few-Shot Hyperspectral Images Classification
    Ding, Lanwei
    Cao, Guo
    Xu, Ling
    Deng, Lindiao
    Xu, Hao
    Pan, Qikun
    Shang, Yanfeng
    FOURTEENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING, ICGIP 2022, 2022, 12705
  • [4] Multi-Scene Mask Detection Based on Multi-Scale Residual and Complementary Attention Mechanism
    Zhou, Yuting
    Lin, Xin
    Luo, Shi
    Ding, Sixian
    Xiao, Luyang
    Ren, Chao
    SENSORS, 2023, 23 (21)
  • [5] MS-ROCANET: MULTI-SCALE RESIDUAL ORTHOGONAL-CHANNEL ATTENTION NETWORK FOR SCENE TEXT DETECTION
    Liu, Jinpeng
    Wu, Song
    He, Dehong
    Xiao, Guoqiang
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2200 - 2204
  • [6] Attention-based multi-scale recursive residual network for low-light image enhancement
    Wang, Kaidi
    Zheng, Yuanlin
    Liao, Kaiyang
    Liu, Haiwen
    Sun, Bangyong
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (03) : 2521 - 2531
  • [7] Attention-based multi-scale recursive residual network for low-light image enhancement
    Kaidi Wang
    Yuanlin Zheng
    Kaiyang Liao
    Haiwen Liu
    Bangyong Sun
    Signal, Image and Video Processing, 2024, 18 : 2521 - 2531
  • [8] A Multi-Scale Graph Attention-Based Transformer for Occluded Person Re-Identification
    Ma, Ming
    Wang, Jianming
    Zhao, Bohan
    APPLIED SCIENCES-BASEL, 2024, 14 (18)
  • [9] Real Scene Text Image Super-Resolution Based on Multi-Scale and Attention Fusion
    Lu, Xinhua
    Wei, Haihai
    Ma, Li
    Xue, Qingji
    Fu, Yonghui
    JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2023, 19 (04) : 427 - 438
  • [10] Multi-scale Information Fusion Combined with Residual Attention for Text Detection
    Zhao, Wenxiu
    Dongye, Changlei
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT II, 2024, 14448 : 506 - 518