Text Gestalt: Stroke-Aware Scene Text Image Super-resolution

Cited by: 0
Authors
Chen, Jingye [1]
Yu, Haiyang [1]
Ma, Jianqi [2]
Li, Bin [1]
Xue, Xiangyang [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
[2] Hong Kong Polytech Univ, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
RECOGNITION; NETWORK;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the last decade, the blossom of deep learning has witnessed the rapid development of scene text recognition. However, the recognition of low-resolution scene text images remains a challenge. Even though some super-resolution methods have been proposed to tackle this problem, they usually treat text images as general images while ignoring the fact that the visual quality of strokes (the atomic unit of text) plays an essential role for text recognition. According to Gestalt Psychology, humans are capable of composing parts of details into the most similar objects guided by prior knowledge. Likewise, when humans observe a low-resolution text image, they will inherently use partial stroke-level details to recover the appearance of holistic characters. Inspired by Gestalt Psychology, we put forward a Stroke-Aware Scene Text Image Super-Resolution method containing a Stroke-Focused Module (SFM) to concentrate on stroke-level internal structures of characters in text images. Specifically, we attempt to design rules for decomposing English characters and digits at stroke-level, then pre-train a text recognizer to provide stroke-level attention maps as positional clues with the purpose of controlling the consistency between the generated super-resolution image and high-resolution ground truth. The extensive experimental results validate that the proposed method can indeed generate more distinguishable images on Text-Zoom and manually constructed Chinese character dataset Degraded-IC13. Furthermore, since the proposed SFM is only used to provide stroke-level guidance when training, it will not bring any time overhead during the test phase.
Pages: 285-293
Page count: 9