RTNet: An End-to-End Method for Handwritten Text Image Translation

Cited by: 4
Authors
Su, Tonghua [1]
Liu, Shuchen [1]
Zhou, Shengjie [1]
Affiliations
[1] Harbin Inst Technol, Sch Software, Harbin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine translation; Text recognition; Image text translation; Handwritten text; End-to-End;
DOI
10.1007/978-3-030-86331-9_7
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text image recognition and translation have a wide range of applications. A straightforward solution is a two-stage approach: first recognize the text, then translate it into the target language. In that approach, the handwritten text recognition model and the machine translation model are trained separately, so any transcription error may degrade the translation quality. This paper proposes an end-to-end learning architecture that directly translates English handwritten text in images into Chinese. The handwriting recognition task and the translation task are combined in a unified deep learning model. First, we perform visual encoding; next, we bridge the semantic gap using a feature transformer; finally, a textual decoder generates the target sentence. To train the model effectively, we use transfer learning to improve its generalization under low-resource conditions. Experiments compare our method with the traditional two-stage one. The results indicate that the performance of the end-to-end model improves greatly as the amount of training data increases. Furthermore, when a larger amount of training data is available, the end-to-end model is more advantageous.
Pages: 99-113
Page count: 15
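Illustrative note: the abstract's remark that a transcription error can degrade translation in a two-stage pipeline can be shown with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `recognize` function only simulates recognition errors on a known string, `translate` is a word-by-word lookup, and `TOY_LEXICON` is a hypothetical three-word lexicon. It is not RTNet and not the baseline used in the experiments.

```python
from typing import Dict, List

# Hypothetical toy lexicon (English -> Chinese), for illustration only.
TOY_LEXICON: Dict[str, str] = {"the": "这只", "cat": "猫", "sleeps": "睡觉"}
UNKNOWN = "<unk>"


def recognize(true_text: str, char_errors: Dict[int, str]) -> str:
    """Simulate stage 1 (handwriting recognition).

    Instead of reading an image, substitute the characters at the given
    positions to mimic recognition mistakes.
    """
    chars = list(true_text)
    for position, wrong_char in char_errors.items():
        chars[position] = wrong_char
    return "".join(chars)


def translate(text: str) -> List[str]:
    """Simulate stage 2 (machine translation) as a word-by-word lookup."""
    return [TOY_LEXICON.get(word, UNKNOWN) for word in text.split()]


def two_stage(true_text: str, char_errors: Dict[int, str]) -> List[str]:
    """Chain the two stages; stage 2 only sees the output of stage 1."""
    return translate(recognize(true_text, char_errors))


# No recognition errors: every word is found in the lexicon.
clean = two_stage("the cat sleeps", {})
# One wrong character (index 5, "a" -> "o") turns "cat" into "cot",
# which the translator can no longer look up.
noisy = two_stage("the cat sleeps", {5: "o"})

print(clean)
print(noisy)
```

In this sketch a single character error removes a whole word from the translation, because the second stage has no access to the image and cannot recover from the first stage's mistake. An end-to-end model of the kind the abstract describes avoids this hard hand-off by training recognition and translation jointly.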