Text Prior Guided Scene Text Image Super-Resolution

Cited by: 30
Authors
Ma, Jianqi [1]
Guo, Shi [1]
Zhang, Lei [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
Keywords
Scene text image super-resolution; super-resolution; text prior; NETWORK; RECOGNITION;
DOI
10.1109/TIP.2023.3237002
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Scene text image super-resolution (STISR) aims to improve the resolution and visual quality of low-resolution (LR) scene text images while simultaneously boosting the performance of text recognition. However, most existing STISR methods treat text images as natural scene images, ignoring the categorical information of text. In this paper, we make an inspiring attempt to embed a text recognition prior into the STISR model. Specifically, we adopt the predicted character recognition probability sequence as the text prior, which can be obtained conveniently from a text recognition model. The text prior provides categorical guidance for recovering high-resolution (HR) text images. In turn, the reconstructed HR image can refine the text prior. Finally, we present a multi-stage text prior guided super-resolution (TPGSR) framework for STISR. Our experiments on the benchmark TextZoom dataset show that TPGSR not only effectively improves the visual quality of scene text images, but also significantly improves text recognition accuracy over existing STISR methods. Our model trained on TextZoom also demonstrates a certain generalization capability on LR images from other datasets. The source code of our work is available
Pages: 1341 - 1353
Number of pages: 13
Related papers
50 in total
  • [1] GARDEN: Generative Prior Guided Network for Scene Text Image Super-Resolution
    Kong, Yuxin
    Ma, Weihong
    Jin, Lianwen
    Xue, Yang
    DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT V, 2024, 14808 : 196 - 214
  • [2] Perceiving Multiple Representations for scene text image super-resolution guided by text recognizer
    Shi, Qin
    Zhu, Yu
    Liu, Yatong
    Ye, Jiongyao
    Yang, Dawei
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 124
  • [3] More and Less: Enhancing Abundance and Refining Redundancy for Text-Prior-Guided Scene Text Image Super-Resolution
    Yang, Wei
    Luo, Yihong
    Ibrayim, Mayire
    Hamdulla, Askar
    DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT V, 2024, 14808 : 129 - 146
  • [4] Scene Text Telescope: Text-Focused Scene Image Super-Resolution
    Chen, Jingye
    Li, Bin
    Xue, Xiangyang
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021 : 12021 - 12030
  • [5] Text Image Super-Resolution Guided by Text Structure and Embedding Priors
    Huang, Cong
    Peng, Xiulian
    Liu, Dong
    Lu, Yan
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (06)
  • [6] Text Gestalt: Stroke-Aware Scene Text Image Super-resolution
    Chen, Jingye
    Yu, Haiyang
    Ma, Jianqi
    Li, Bin
    Xue, Xiangyang
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022 : 285 - 293
  • [7] Improving Scene Text Image Super-resolution via Dual Prior Modulation Network
    Zhu, Shipeng
    Zhao, Zuoyan
    Fang, Pengfei
    Xue, Hui
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023 : 3843 - 3851
  • [8] Batch-transformer for scene text image super-resolution
    Sun, Yaqi
    Xie, Xiaolan
    Li, Zhi
    Yang, Kai
    VISUAL COMPUTER, 2024, 40 (10) : 7399 - 7409
  • [9] A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution
    Ma, Jianqi
    Liang, Zhetong
    Zhang, Lei
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022 : 5901 - 5910
  • [10] Scene Text Image Super-Resolution Via Semantic Distillation and Text Perceptual Loss
    Zhao, Cairong
    Shu, Rui
    Feng, Shuyang
    Zhu, Liang
    Wang, Xuekuan
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 1153 - 1164