TSRGAN: Real-world text image super-resolution based on adversarial learning and triplet attention

Cited by: 19
Authors
Fang, Chuantao [1]
Zhu, Yu [1]
Liao, Lei [1]
Ling, Xiaofeng [1]
Affiliations
[1] East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China
Funding
Natural Science Foundation of Shanghai;
Keywords
Text image super-resolution; Adversarial learning; Triplet attention; Wavelet loss; Scene text recognition; NEURAL-NETWORK; SCENE;
DOI
10.1016/j.neucom.2021.05.060
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text in a low-resolution (LR) image is usually hard to read, and super-resolution (SR) is an intuitive solution to this problem. Existing single image super-resolution (SISR) models are mainly trained on synthetic datasets whose LR images are obtained by applying bicubic interpolation or Gaussian blur to high-resolution (HR) images. However, these models generalize poorly to practical scenarios because real-world LR images are more difficult to super-resolve. The recently proposed TextZoom dataset is the first dataset for real-world text image super-resolution. We propose a new model, termed TSRGAN, trained on this dataset. First, a discriminator is designed to prevent the SR network from generating over-smoothed images. Second, we introduce triplet attention into the SR network for better representational ability. Third, in addition to L2 loss and adversarial loss, a wavelet loss is incorporated to help reconstruct sharper character edges. Since TextZoom provides text labels, the recognition accuracy of a scene text recognition (STR) model can be used to evaluate the quality of SR images. This accuracy reflects the performance of text image SR models better than traditional SR evaluation metrics such as PSNR and SSIM. Comprehensive experiments show the superiority of TSRGAN. Compared with the state-of-the-art method, the proposed TSRGAN improves the average recognition accuracy of ASTER, MORAN and CRNN on TextZoom by 0.8%, 1.5% and 3.2%, respectively. © 2021 Elsevier B.V. All rights reserved.
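The evaluation idea in the abstract (score SR images by how well a recognizer reads them, rather than by pixel fidelity alone) can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names, the case-insensitive alphanumeric normalization, and the sample strings are assumptions made for illustration, and the recognizer outputs are stand-ins for what a model such as ASTER, MORAN or CRNN would return on SR images.

```python
import math
import string

# Assumption: labels are compared case-insensitively on letters and digits only.
_ALLOWED = set(string.ascii_lowercase + string.digits)


def normalize(text):
    """Lower-case the text and keep only ASCII letters and digits."""
    return "".join(ch for ch in text.lower() if ch in _ALLOWED)


def recognition_accuracy(predictions, labels):
    """Fraction of images whose recognized text matches the label exactly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = sum(normalize(p) == normalize(t) for p, t in zip(predictions, labels))
    return correct / len(labels)


def psnr(sr, hr, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale
    images given as nested lists of pixel values."""
    squared_errors = [
        (a - b) ** 2
        for row_sr, row_hr in zip(sr, hr)
        for a, b in zip(row_sr, row_hr)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical recognizer outputs on SR images and their ground-truth labels.
    preds = ["Coffee", "0pen", "EXIT"]
    truth = ["coffee", "open", "exit"]
    print("recognition accuracy:", recognition_accuracy(preds, truth))
    print("PSNR (dB):", psnr([[10, 10]], [[0, 0]]))
```

A single misread character ("0pen" for "open") counts as a full recognition error even though the pixel difference behind it may be small. That is one reason recognition accuracy can separate text SR models that PSNR ranks as nearly equal.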
Pages: 88 - 96
Page count: 9