Scene Text Telescope: Text-Focused Scene Image Super-Resolution

Cited by: 75
Authors
Chen, Jingye [1]
Li, Bin [1]
Xue, Xiangyang [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
Keywords
NEURAL-NETWORK; ATTENTION NETWORK; RECOGNITION;
DOI
10.1109/CVPR46437.2021.01185
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image super-resolution, which is often regarded as a preprocessing procedure of scene text recognition, aims to recover the realistic features from a low-resolution text image. It has always been challenging due to large variations in text shapes, fonts, backgrounds, etc. However, most existing methods employ generic super-resolution frameworks to handle scene text images while ignoring text-specific properties such as text-level layouts and character-level details. In this paper, we establish a text-focused super-resolution framework, called Scene Text Telescope (STT). In terms of text-level layouts, we propose a Transformer-Based Super-Resolution Network (TBSRN) containing a Self-Attention Module to extract sequential information, which is robust to tackle the texts in arbitrary orientations. In terms of character-level details, we propose a Position-Aware Module and a Content-Aware Module to highlight the position and the content of each character. By observing that some characters look indistinguishable in low-resolution conditions, we use a weighted cross-entropy loss to tackle this problem. We conduct extensive experiments, including text recognition with pre-trained recognizers and image quality evaluation, on TextZoom and several scene text recognition benchmarks to assess the super-resolution images. The experimental results show that our STT can indeed generate text-focused super-resolution images and outperform the existing methods in terms of recognition accuracy.
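The abstract mentions a weighted cross-entropy loss for characters that look alike at low resolution, but this record does not give its formulation. The following is only a minimal sketch of a generic per-class weighted cross-entropy, not the authors' implementation: the function name, the toy alphabet, and the weight values are all hypothetical, chosen so that classes assumed to be confusable contribute more to the loss.

```python
import math

def weighted_cross_entropy(logits, target, class_weights):
    """Generic weighted cross-entropy for a single character position.

    logits: raw scores, one per character class.
    target: index of the ground-truth class.
    class_weights: per-class weights (hypothetical values; larger for
        classes assumed to be easily confused at low resolution).
    """
    # Numerically stable log-softmax: shift by the max before exponentiating.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_prob = logits[target] - log_sum_exp
    # Scale the negative log-likelihood by the weight of the true class.
    return -class_weights[target] * log_prob

# Toy alphabet; "c", "e", "o" are assumed to be confusable when blurred.
alphabet = ["a", "c", "e", "o"]
uniform_weights = [1.0, 1.0, 1.0, 1.0]
confusable_weights = [1.0, 2.0, 2.0, 2.0]  # assumed values, for illustration only

logits = [0.1, 1.2, 1.0, 0.9]
target = alphabet.index("e")

loss_plain = weighted_cross_entropy(logits, target, uniform_weights)
loss_weighted = weighted_cross_entropy(logits, target, confusable_weights)
```

With these assumed weights, a mistake on a confusable character is penalized twice as heavily as it would be under the unweighted loss, which pushes a model to separate look-alike characters more carefully.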
Pages: 12021-12030
Page count: 10