Scene Text Telescope: Text-Focused Scene Image Super-Resolution

Cited by: 75

Authors:

Chen, Jingye [1]

Li, Bin [1]

Xue, Xiangyang [1]

Affiliations:

[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Keywords:

NEURAL-NETWORK; ATTENTION NETWORK; RECOGNITION;

DOI:

10.1109/CVPR46437.2021.01185

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image super-resolution, which is often regarded as a preprocessing procedure of scene text recognition, aims to recover the realistic features from a low-resolution text image. It has always been challenging due to large variations in text shapes, fonts, backgrounds, etc. However, most existing methods employ generic super-resolution frameworks to handle scene text images while ignoring text-specific properties such as text-level layouts and character-level details. In this paper, we establish a text-focused super-resolution framework, called Scene Text Telescope (STT). In terms of text-level layouts, we propose a Transformer-Based Super-Resolution Network (TBSRN) containing a Self-Attention Module to extract sequential information, which is robust to tackle the texts in arbitrary orientations. In terms of character-level details, we propose a Position-Aware Module and a Content-Aware Module to highlight the position and the content of each character. By observing that some characters look indistinguishable in low-resolution conditions, we use a weighted cross-entropy loss to tackle this problem. We conduct extensive experiments, including text recognition with pre-trained recognizers and image quality evaluation, on TextZoom and several scene text recognition benchmarks to assess the super-resolution images. The experimental results show that our STT can indeed generate text-focused super-resolution images and outperform the existing methods in terms of recognition accuracy.


Pages: 12021-12030

Page count: 10

50 records in total

[21] Parametric loss-based super-resolution for scene text recognition
Supatta Viriyavisuthisakul
Parinya Sanguansat
Teeradaj Racharak
Minh Le Nguyen
Natsuda Kaothanthong
Choochart Haruechaiyasak
Toshihiko Yamasaki
Machine Vision and Applications, 2023, 34
[22] More and Less: Enhancing Abundance and Refining Redundancy for Text-Prior-Guided Scene Text Image Super-Resolution
Yang, Wei
Luo, Yihong
Ibrayim, Mayire
Hamdulla, Askar
DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT V, 2024, 14808 : 129 - 146
[23] Scene text image super-resolution via textual reasoning and multiscale cross-convolution
Lan Yu
Xiaojie Li
Qi Yu
Guangju Li
Dehu Jin
Meng Qi
Applied Intelligence, 2024, 54 : 1997 - 2008
[24] Text-Enhanced Scene Image Super-Resolution via Stroke Mask and Orthogonal Attention
Shu, Rui
Zhao, Cairong
Feng, Shuyang
Zhu, Liang
Miao, Duoqian
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (11) : 6317 - 6330
[25] Pragmatic degradation learning for scene text image super-resolution with data-training strategy
Yang, Shengying
Xie, Lifeng
Ran, Xiaoxiao
Lei, Jingsheng
Qian, Xiaohong
KNOWLEDGE-BASED SYSTEMS, 2024, 285
[26] Scene text image super-resolution via textual reasoning and multiscale cross-convolution
Yu, Lan
Li, Xiaojie
Yu, Qi
Li, Guangju
Jin, Dehu
Qi, Meng
APPLIED INTELLIGENCE, 2024, 54 (02) : 1997 - 2008
[27] Real Scene Text Image Super-Resolution Based on Multi-Scale and Attention Fusion
Lu, Xinhua
Wei, Haihai
Ma, Li
Xue, Qingji
Fu, Yonghui
JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2023, 19 (04) : 427 - 438
[28] TextSRNet: Scene Text Super-Resolution Based on Contour Prior and Atrous Convolution
Ma, Jizhao
Jin, Lianwen
Zhang, Jiaxin
Jiang, Jiajia
Xue, Yang
He, Mengchao
2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022 : 3252 - 3258
[29] Scene Text Image Super-Resolution Through Multi-Scale Interaction of Structural and Semantic Priors
Zhu Z.
Zhang L.
Bai Y.
Wang Y.
Li P.
IEEE Transactions on Artificial Intelligence, 2024, 5 (07) : 1 - 11
[30] Scene Text Image Super-Resolution Reconstruction Based on Perceiving Multi-Domain Character Distance
Huang, Jun-Yang
Chen, Hong-Hui
Wang, Jia-Bao
Chen, Ping-Ping
Lin, Zhi-Jian
Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2024, 52 (07) : 2262 - 2270
