Training an End-to-End Model for Offline Handwritten Japanese Text Recognition by Generated Synthetic Patterns

被引:17
|
作者
Nam Tuan Ly [1 ]
Cuong Tuan Nguyen [1 ]
Nakagawa, Masaki [1 ]
机构
[1] Tokyo Univ Agr & Technol, Dept Comp & Informat Sci, 2-24-16 Naka Cho, Koganei, Tokyo 1848588, Japan
关键词
Handwritten Japanese Text Recognition; End-to-End Model; CNN; BLSTM; Synthetic Image Generation;
D O I
10.1109/ICFHR-2018.2018.00022
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper presents an end-to-end model of Deep Convolutional Recurrent Network (DCRN) for recognizing offline handwritten Japanese text lines. The end-to-end DCRN model has three parts: a convolutional feature extractor using Deep Convolutional Neural Network (DCNN) to extract a feature sequence from a text line image; recurrent layers employing a Deep Bidirectional LSTM to predict pre-frame from the feature sequence; and a transcription layer using Connectionist Temporal Classification (CTC) to convert the pre-frame predictions into the label sequence. Since our end-to-end model requires a large data for training, we synthesize handwritten text line images from sentences in corpora and handwritten character patterns in the Nakayosi and Kuchibue database with elastic distortions. In the experiment, we evaluate the performance of the end-to-end model and the effectiveness of the synthetic data generation method on the test set of the TUAT Kondate database. The results of the experiments show that our end-to-end model achieves higher than the state-of-the-art recognition accuracy on the test set of TUAT Kondate with 96.35% and 98.05% character level recognition accuracies without and with the generated synthetic data, respectively.
引用
收藏
页码:74 / 79
页数:6
相关论文
共 50 条
  • [41] DiZNet: An end-to-end text detection and recognition algorithm with detail in text zone
    Zhou, Di
    Zhang, Jianxun
    Li, Chao
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 104
  • [42] Bootstrap an End-to-end ASR System by Multilingual Training, Transfer Learning, Text-to-text Mapping and Synthetic Audio
    Giollo, Manuel
    Gunceler, Deniz
    Liu, Yulan
    Willett, Daniel
    INTERSPEECH 2021, 2021, : 2416 - 2420
  • [43] On usage of an end-to-end deep neural architecture for handwritten digit string recognition
    Omidi, Zahra
    BabaAli, Bagher
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (04) : 3009 - 3020
  • [44] On usage of an end-to-end deep neural architecture for handwritten digit string recognition
    Zahra Omidi
    Bagher BabaAli
    Signal, Image and Video Processing, 2024, 18 : 3009 - 3020
  • [45] Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention
    Bluche, Theodore
    Louradour, Jerome
    Messina, Ronaldo
    2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017, : 1050 - 1055
  • [46] An End-to-End model for Vietnamese speech recognition
    Van Huy Nguyen
    2019 IEEE - RIVF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES (RIVF), 2019, : 307 - 312
  • [47] INTERNAL LANGUAGE MODEL TRAINING FOR DOMAIN-ADAPTIVE END-TO-END SPEECH RECOGNITION
    Meng, Zhong
    Kanda, Naoyuki
    Gaur, Yashesh
    Parthasarathy, Sarangarajan
    Sun, Eric
    Lu, Liang
    Chen, Xie
    Li, Jinyu
    Gong, Yifan
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7338 - 7342
  • [48] Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition
    Meng, Zhong
    Gaur, Yashesh
    Kanda, Naoyuki
    Li, Jinyu
    Chen, Xie
    Wu, Yu
    Gong, Yifan
    INTERSPEECH 2022, 2022, : 2608 - 2612
  • [49] Speech-and-Text Transformer: Exploiting Unpaired Text for End-to-End Speech Recognition
    Wang, Qinyi
    Zhou, Xinyuan
    Li, Haizhou
    APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2023, 12 (01)
  • [50] Improvement of the end-to-end scene text recognition method for "text-to-speech" conversion
    Makhmudov, Fazliddin
    Mukhiddinov, Mukhriddin
    Abdusalomov, Akmalbek
    Avazov, Kuldoshbay
    Khamdamov, Utkir
    Cho, Young Im
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2020, 18 (06)