Weakly supervised scene text generation for low-resource languages

Cited by: 3
Authors
Xie, Yangchen [1]
Chen, Xinyuan [2]
Zhan, Hongjian [1, 3]
Shivakumara, Palaiahnakote [4]
Yin, Bing [5]
Liu, Cong [5]
Lu, Yue [1]
Affiliations
[1] East China Normal Univ, Sch Commun & Elect Engn, Shanghai 200241, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
[3] Chongqing Inst East China Normal Univ, Chongqing Key Lab Precis Opt, Chongqing 401120, Peoples R China
[4] Univ Malaya, Fac Comp Sci & Informat Technol FSKTM, Kuala Lumpur, Malaysia
[5] iFLYTEK Res, Hefei, Peoples R China
Keywords
Scene text generation; Style transfer; Low-resource languages
DOI
10.1016/j.eswa.2023.121622
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A large number of annotated training images is crucial for training successful scene text recognition models. However, collecting sufficient datasets can be a labor-intensive and costly process, particularly for low-resource languages. Automatically generating text data has shown promise in alleviating this problem. Unfortunately, existing scene text generation methods typically rely on a large amount of paired data, which is difficult to obtain for low-resource languages. In this paper, we propose a novel weakly supervised scene text generation method that leverages a few recognition-level labels as weak supervision. The proposed method can generate a large number of scene text images with diverse backgrounds and font styles through cross-language generation. Our method disentangles the content and style features of scene text images, with the former representing textual information and the latter representing characteristics such as font, alignment, and background. To preserve the complete content structure of generated images, we introduce an integrated attention module. Furthermore, to bridge the style gap between different languages, we incorporate a pre-trained font classifier. We evaluate our method using state-of-the-art scene text recognition models. Experiments demonstrate that our generated scene text significantly improves scene text recognition accuracy and helps achieve higher accuracy when complemented with other generative methods.
Pages: 12
Related papers
50 in total
  • [1] Weakly Supervised POS Taggers Perform Poorly on Truly Low-Resource Languages
    Kann, Katharina
    Lacroix, Ophelie
    Sogaard, Anders
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 8066 - 8073
  • [2] Efficient Entity Candidate Generation for Low-Resource Languages
    Garcia-Duran, Alberto
    Arora, Akhil
    West, Robert
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6429 - 6438
  • [3] XAlign: Cross-lingual Fact-to-Text Alignment and Generation for Low-Resource Languages
    Abhishek, Tushar
    Sagare, Shivprasad
    Singh, Bhavyajeet
    Sharma, Anubhav
    Gupta, Manish
    Varma, Vasudeva
    COMPANION PROCEEDINGS OF THE WEB CONFERENCE 2022, WWW 2022 COMPANION, 2022, : 171 - 175
  • [4] Hybrid Approach Text Generation for Low-Resource Language
    Rakhimova, Diana
    Adali, Esref
    Karibayeva, Aidana
    ADVANCES IN COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2024, PART I, 2024, 2165 : 256 - 268
  • [5] Hybrid Encoding Method for Scene Text Recognition in Low-Resource Uyghur
    Xu, Miaomiao
    Zhang, Jiang
    Xu, Lianghui
    Li, Yanbing
    Silamu, Wushour
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VII, 2025, 15037 : 86 - 99
  • [6] Harnessing Knowledge Distillation for Enhanced Text-to-Text Translation in Low-Resource Languages
    Ahmed, Manar Ouled
    Ming, Zuheng
    Othmani, Alice
    SPEECH AND COMPUTER, SPECOM 2024, PT II, 2025, 15300 : 295 - 307
  • [7] Exploring low-resource medical image classification with weakly supervised prompt learning
    Zheng, Fudan
    Cao, Jindong
    Yu, Weijiang
    Chen, Zhiguang
    Xiao, Nong
    Lu, Yutong
    PATTERN RECOGNITION, 2024, 149
  • [8] SUPERVISED AND UNSUPERVISED ACTIVE LEARNING FOR AUTOMATIC SPEECH RECOGNITION OF LOW-RESOURCE LANGUAGES
    Syed, Ali Raza
    Rosenberg, Andrew
    Kislal, Ellen
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5320 - 5324
  • [9] Weakly Supervised Attention Rectification for Scene Text Recognition
    Gu, Chengyu
    Wang, Shilin
    Zhu, Yiwei
    Huang, Zheng
    Chen, Kai
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 779 - 786
  • [10] Dual Feature Enhanced Scene Text Recognition Method for Low-Resource Uyghur
    Xu, Miaomiao
    Zhang, Jiang
    Xu, Lianghui
    Li, Yanbing
    Silamu, Wushour
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VII, 2025, 15037 : 58 - 71