Text Gestalt: Stroke-Aware Scene Text Image Super-resolution

Cited by: 0
Authors
Chen, Jingye [1]
Yu, Haiyang [1]
Ma, Jianqi [2]
Li, Bin [1]
Xue, Xiangyang [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
[2] Hong Kong Polytech Univ, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
RECOGNITION; NETWORK;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the last decade, the blossom of deep learning has witnessed the rapid development of scene text recognition. However, the recognition of low-resolution scene text images remains a challenge. Even though some super-resolution methods have been proposed to tackle this problem, they usually treat text images as general images while ignoring the fact that the visual quality of strokes (the atomic unit of text) plays an essential role for text recognition. According to Gestalt Psychology, humans are capable of composing parts of details into the most similar objects guided by prior knowledge. Likewise, when humans observe a low-resolution text image, they will inherently use partial stroke-level details to recover the appearance of holistic characters. Inspired by Gestalt Psychology, we put forward a Stroke-Aware Scene Text Image Super-Resolution method containing a Stroke-Focused Module (SFM) to concentrate on stroke-level internal structures of characters in text images. Specifically, we attempt to design rules for decomposing English characters and digits at stroke-level, then pre-train a text recognizer to provide stroke-level attention maps as positional clues with the purpose of controlling the consistency between the generated super-resolution image and high-resolution ground truth. The extensive experimental results validate that the proposed method can indeed generate more distinguishable images on Text-Zoom and manually constructed Chinese character dataset Degraded-IC13. Furthermore, since the proposed SFM is only used to provide stroke-level guidance when training, it will not bring any time overhead during the test phase.
Pages: 285 - 293
Page count: 9
Related papers
50 in total
  • [11] Scene Text Image Super-Resolution via Parallelly Contextual Attention Network
    Zhao, Cairong
    Feng, Shuyang
    Zhao, Brian Nlong
    Ding, Zhijun
    Wu, Jun
    Shen, Fuming
    Shen, Heng Tao
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2908 - 2917
  • [12] Self-supervised memory learning for scene text image super-resolution
    Guo, Kehua
    Zhu, Xiangyuan
    Schaefer, Gerald
    Ding, Rui
    Fang, Hui
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258
  • [13] Gradient-Based Graph Attention for Scene Text Image Super-resolution
    Zhu, Xiangyuan
    Guo, Kehua
    Fang, Hui
    Ding, Rui
    Wu, Zheng
    Schaefer, Gerald
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 3861 - 3869
  • [14] Advancing scene text image super-resolution via edge enhancement priors
    Li, Hongjun
    Li, Shangfeng
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (11) : 8241 - 8250
  • [15] Super-resolution enhancement of text image sequences
    Capel, D
    Zisserman, A
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 1, PROCEEDINGS: COMPUTER VISION AND IMAGE ANALYSIS, 2000, : 600 - 605
  • [16] TLWSR: Weakly supervised real-world scene text image super-resolution using text label
    Shi, Qin
    Zhu, Yu
    Fang, Chuantao
    Yang, Dawei
    IET IMAGE PROCESSING, 2023, 17 (09) : 2780 - 2790
  • [17] Text Image Super-Resolution Guided by Text Structure and Embedding Priors
    Huang, Cong
    Peng, Xiulian
    Liu, Dong
    Lu, Yan
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (06)
  • [18] Better Skeleton Better Readability: Scene Text Image Super-Resolution via Skeleton-Aware Diffusion Model
    Singh, Shrey
    Keserwani, Prateek
    Roy, Partha Pratim
    Saini, Rajkumar
    IEEE ACCESS, 2024, 12 : 187640 - 187651
  • [19] HiREN: Towards higher supervision quality for better scene text image super-resolution
    Zhao, Minyi
    Xu, Yi
    Li, Bingjia
    Wang, Jie
    Guan, Jihong
    Zhou, Shuigeng
    NEUROCOMPUTING, 2025, 623
  • [20] DCDM: Diffusion-Conditioned-Diffusion Model for Scene Text Image Super-Resolution
    Singh, Shrey
    Keserwani, Prateek
    Iwamura, Masakazu
    Roy, Partha Pratim
    COMPUTER VISION - ECCV 2024, PT XV, 2025, 15073 : 303 - 320