Scene Text Telescope: Text-Focused Scene Image Super-Resolution

Citations: 75
Authors
Chen, Jingye [1]
Li, Bin [1]
Xue, Xiangyang [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
Keywords
NEURAL-NETWORK; ATTENTION NETWORK; RECOGNITION
DOI
10.1109/CVPR46437.2021.01185
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Image super-resolution, which is often regarded as a preprocessing procedure of scene text recognition, aims to recover the realistic features from a low-resolution text image. It has always been challenging due to large variations in text shapes, fonts, backgrounds, etc. However, most existing methods employ generic super-resolution frameworks to handle scene text images while ignoring text-specific properties such as text-level layouts and character-level details. In this paper, we establish a text-focused super-resolution framework, called Scene Text Telescope (STT). In terms of text-level layouts, we propose a Transformer-Based Super-Resolution Network (TBSRN) containing a Self-Attention Module to extract sequential information, which is robust to tackle the texts in arbitrary orientations. In terms of character-level details, we propose a Position-Aware Module and a Content-Aware Module to highlight the position and the content of each character. By observing that some characters look indistinguishable in low-resolution conditions, we use a weighted cross-entropy loss to tackle this problem. We conduct extensive experiments, including text recognition with pre-trained recognizers and image quality evaluation, on TextZoom and several scene text recognition benchmarks to assess the super-resolution images. The experimental results show that our STT can indeed generate text-focused super-resolution images and outperform the existing methods in terms of recognition accuracy.
Pages: 12021-12030
Page count: 10
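
The abstract mentions a weighted cross-entropy loss for characters that look alike at low resolution but does not say how the weights are chosen. The following is only a minimal sketch of the general idea, not the authors' implementation: the four-character alphabet, the weight values, and the function names `softmax` and `weighted_cross_entropy` are all hypothetical and chosen purely for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits, target, class_weights):
    """Cross-entropy for one character, scaled by the target class weight."""
    probs = softmax(logits)
    return -class_weights[target] * math.log(probs[target])


# Hypothetical alphabet; "e" and "o" are assumed to be easily confused
# at low resolution, so they get a larger (made-up) weight.
alphabet = ["a", "e", "o", "x"]
class_weights = [1.0, 2.0, 2.0, 1.0]

# Uniform scores: the model is equally unsure about every character.
logits = [0.0, 0.0, 0.0, 0.0]

loss_x = weighted_cross_entropy(logits, alphabet.index("x"), class_weights)
loss_o = weighted_cross_entropy(logits, alphabet.index("o"), class_weights)
print(loss_x, loss_o)
```

With identical predictions, the mistake on the higher-weighted character "o" costs twice as much as the one on "x", which is how class weights steer training toward the characters assumed to be hard to tell apart.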