RTNet: An End-to-End Method for Handwritten Text Image Translation

Cited by: 4
Authors
Su, Tonghua [1]
Liu, Shuchen [1]
Zhou, Shengjie [1]
Affiliations
[1] Harbin Inst Technol, Sch Software, Harbin, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Machine translation; Text recognition; Image text translation; Handwritten text; End-to-End;
DOI
10.1007/978-3-030-86331-9_7
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
Text image recognition and translation have a wide range of applications. A straightforward approach has two stages: first recognize the text, then translate it into the target language. In that approach, the handwritten text recognition model and the machine translation model are trained separately, so any transcription error may degrade translation quality. This paper proposes an end-to-end learning architecture that directly translates English handwritten text in images into Chinese. The handwriting recognition task and the translation task are combined in a unified deep learning model. First we perform visual encoding, then bridge the semantic gaps using a feature transformer, and finally use a textual decoder to generate the target sentence. To train the model effectively, we use transfer learning to improve its generalization under low-resource conditions. Experiments compare our method with the traditional two-stage one. The results indicate that the performance of the end-to-end model improves greatly as the amount of training data increases. Furthermore, when a larger amount of training data is available, the end-to-end model is more advantageous.
Pages: 99-113
Page count: 15