Weakly supervised scene text generation for low-resource languages

Cited by: 3
Authors
Xie, Yangchen [1 ]
Chen, Xinyuan [2 ]
Zhan, Hongjian [1 ,3 ]
Shivakumara, Palaiahnakote [4 ]
Yin, Bing [5 ]
Liu, Cong [5 ]
Lu, Yue [1 ]
Affiliations
[1] East China Normal Univ, Sch Commun & Elect Engn, Shanghai 200241, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
[3] Chongqing Inst East China Normal Univ, Chongqing Key Lab Precis Opt, Chongqing 401120, Peoples R China
[4] Univ Malaya, Fac Comp Sci & Informat Technol FSKTM, Kuala Lumpur, Malaysia
[5] iFLYTEK Res, Hefei, Peoples R China
Keywords
Scene text generation; Style transfer; Low-resource languages;
DOI
10.1016/j.eswa.2023.121622
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A large number of annotated training images is crucial for training successful scene text recognition models. However, collecting sufficient data is labor-intensive and costly, particularly for low-resource languages. Auto-generating text data has shown promise in alleviating this problem, but existing scene text generation methods typically rely on a large amount of paired data, which is difficult to obtain for low-resource languages. In this paper, we propose a novel weakly supervised scene text generation method that leverages a few recognition-level labels as weak supervision. The proposed method can generate a large number of scene text images with diverse backgrounds and font styles through cross-language generation. Our method disentangles the content and style features of scene text images: the former represents textual information, while the latter captures characteristics such as font, alignment, and background. To preserve the complete content structure of generated images, we introduce an integrated attention module. Furthermore, to bridge the style gap between different languages, we incorporate a pre-trained font classifier. We evaluate our method using state-of-the-art scene text recognition models. Experiments demonstrate that our generated scene text significantly improves scene text recognition accuracy, and that accuracy improves further when our method is combined with other generative methods.
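The core idea described in the abstract, disentangling a source image into separate content and style codes and then recombining the source content with a target-language style, can be illustrated with a minimal toy sketch. Everything below is a simplified assumption for illustration: the linear "encoders" and "decoder", the dimensions, and the function names are placeholders, not the authors' actual networks, which additionally use an integrated attention module and a pre-trained font classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions; the paper does not specify these.
D_IMG, D_CONTENT, D_STYLE = 64, 16, 8

# Randomly initialized linear maps standing in for the trained
# content encoder, style encoder, and decoder. Placeholder weights only.
W_content = rng.standard_normal((D_IMG, D_CONTENT)) * 0.1
W_style = rng.standard_normal((D_IMG, D_STYLE)) * 0.1
W_decode = rng.standard_normal((D_CONTENT + D_STYLE, D_IMG)) * 0.1


def encode(img):
    """Disentangle an image vector into (content, style) codes."""
    return img @ W_content, img @ W_style


def decode(content, style):
    """Recombine a content code with a (possibly different) style code."""
    return np.concatenate([content, style]) @ W_decode


# Source image in language A; style reference image in language B.
src = rng.standard_normal(D_IMG)
ref = rng.standard_normal(D_IMG)

c_src, _ = encode(src)            # keep the source's textual content
_, s_ref = encode(ref)            # borrow the reference's style
generated = decode(c_src, s_ref)  # cross-language style transfer
print(generated.shape)            # (64,)
```

In the actual method, training would drive the style code to carry only font, alignment, and background information (with the recognition-level labels weakly supervising the content code), so that swapping style codes transfers appearance without altering the text.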
Pages: 12