Training an End-to-End Model for Offline Handwritten Japanese Text Recognition by Generated Synthetic Patterns

Cited by: 17

Authors:

Nam Tuan Ly [1]

Cuong Tuan Nguyen [1]

Nakagawa, Masaki [1]

Affiliation:

[1] Tokyo Univ Agr & Technol, Dept Comp & Informat Sci, 2-24-16 Naka Cho, Koganei, Tokyo 1848588, Japan

Source:

PROCEEDINGS 2018 16TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR) | 2018

Keywords:

Handwritten Japanese Text Recognition; End-to-End Model; CNN; BLSTM; Synthetic Image Generation;

DOI:

10.1109/ICFHR-2018.2018.00022

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents an end-to-end Deep Convolutional Recurrent Network (DCRN) model for recognizing offline handwritten Japanese text lines. The DCRN model has three parts: a convolutional feature extractor that uses a Deep Convolutional Neural Network (DCNN) to extract a feature sequence from a text line image; recurrent layers that employ a Deep Bidirectional LSTM to make per-frame predictions from the feature sequence; and a transcription layer that uses Connectionist Temporal Classification (CTC) to convert the per-frame predictions into the label sequence. Since the end-to-end model requires a large amount of training data, we synthesize handwritten text line images from sentences in corpora and handwritten character patterns in the Nakayosi and Kuchibue databases, applying elastic distortions. In the experiments, we evaluate the performance of the end-to-end model and the effectiveness of the synthetic data generation method on the test set of the TUAT Kondate database. The results show that the end-to-end model surpasses the state-of-the-art recognition accuracy on the TUAT Kondate test set, reaching character-level recognition accuracies of 96.35% without and 98.05% with the generated synthetic data.
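The transcription step named in the abstract, turning frame-by-frame predictions into a label sequence, can be illustrated with a minimal sketch of CTC best-path (greedy) decoding. This is not the authors' implementation: the abstract does not say which decoding rule is used, and the function name `ctc_best_path_decode`, the blank index 0, and the toy three-symbol alphabet are all assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_best_path_decode(
    frame_scores: Sequence[Sequence[float]], blank: int = BLANK
) -> List[int]:
    """Collapse per-frame class scores into a label sequence.

    Best-path decoding: take the top class in each frame, merge
    consecutive repeats, then drop blank labels.
    """
    # Step 1: highest-scoring class index for every frame.
    best_per_frame = [
        max(range(len(scores)), key=lambda i: scores[i])
        for scores in frame_scores
    ]

    # Steps 2 and 3: keep a label only when it differs from the previous
    # frame's label and is not the blank.
    decoded: List[int] = []
    previous = None
    for label in best_per_frame:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Toy alphabet (hypothetical): index 0 is the blank.
ALPHABET = ["<blank>", "日", "本"]

# Seven frames of scores over the three classes.
toy_scores = [
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # 日
    [0.1, 0.8, 0.1],    # 日 (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank (separates two genuine 日 labels)
    [0.1, 0.8, 0.1],    # 日
    [0.1, 0.1, 0.8],    # 本
    [0.1, 0.1, 0.8],    # 本 (repeat, merged)
]

labels = ctc_best_path_decode(toy_scores)
text = "".join(ALPHABET[i] for i in labels)
```

The blank between the second and third 日 frames is what lets a doubled character survive the merge step; without it, the repeated frames would collapse into a single label.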


Pages: 74-79

Page count: 6
