Simultaneous Script Identification and Handwriting Recognition via Multi-Task Learning of Recurrent Neural Networks

Cited by: 23

Authors:

Chen, Zhuo [1,2]

Wu, Yichao [1,2]

Yin, Fei [1]

Liu, Cheng-Lin [1,2]

Affiliations:

[1] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, 95 Zhongguan East Rd, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

Source:

2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1 | 2017

Funding:

National Natural Science Foundation of China;

Keywords:

multi-task learning; SepMDLSTM; script identification; language identification; handwritten text recognition;

DOI:

10.1109/ICDAR.2017.92

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a method for simultaneous script identification and handwritten text line recognition in a multi-task learning framework. First, we use Separable Multi-Dimensional Long Short-Term Memory (SepMDLSTM) to encode the input text line images based on convolutional feature extraction. The extracted features are then fed into two classification modules, one for script identification and one for multi-script text recognition. All network parameters are trained end-to-end by multi-task learning, where the script identification task minimizes the Negative Log Likelihood (NLL) loss and the text recognition task minimizes the Connectionist Temporal Classification (CTC) loss. We evaluated the proposed method on handwritten text line datasets in three languages: IAM (English), Rimes (French) and IFN/ENIT (Arabic). Experimental results demonstrate that the multi-task learning framework performs well on both script identification and text recognition. In particular, the script identification accuracy is higher than 99.9%, and the character error rate (CER) of text recognition is even lower than that of some single-script text recognition systems.
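
The joint objective described in the abstract can be illustrated with a small sketch. The abstract does not say how the NLL and CTC losses are combined, so the plain weighted sum and the `weight` parameter below are assumptions, as are all function names and the toy probabilities. This is a pure-Python illustration of the two loss terms, not the authors' implementation, and it works in probability space (no log-space stabilization), which is only suitable for tiny inputs.

```python
import math


def nll_loss(script_probs, script_label):
    """Negative log likelihood of the true script class."""
    return -math.log(script_probs[script_label])


def ctc_loss(frame_probs, target, blank=0):
    """CTC loss computed with the forward algorithm.

    frame_probs: one list of per-symbol probabilities per time step.
    target: list of label indices, without blanks.
    """
    # Extended target with blanks interleaved: [b, t1, b, t2, ..., b]
    ext = [blank]
    for symbol in target:
        ext += [symbol, blank]
    num_states, num_steps = len(ext), len(frame_probs)

    # Paths may start in the leading blank or in the first label.
    alpha = [0.0] * num_states
    alpha[0] = frame_probs[0][ext[0]]
    if num_states > 1:
        alpha[1] = frame_probs[0][ext[1]]

    for t in range(1, num_steps):
        new_alpha = [0.0] * num_states
        for s in range(num_states):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping over a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * frame_probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end in the last label or the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if num_states > 1 else 0.0)
    return -math.log(likelihood)


def multitask_loss(script_probs, script_label, frame_probs, target, weight=1.0):
    """Assumed combination: CTC loss plus a weighted NLL loss."""
    return ctc_loss(frame_probs, target) + weight * nll_loss(script_probs, script_label)


# Toy example: 3 script classes; 2 time steps over the symbols {blank, 'a'}.
script_probs = [0.9, 0.05, 0.05]          # hypothetical script-head output
frame_probs = [[0.5, 0.5], [0.5, 0.5]]    # hypothetical recognition-head output
print(multitask_loss(script_probs, 0, frame_probs, [1]))
```

In a trained system both probability vectors would come from heads that share the same encoder, so gradients from either loss term update the shared parameters.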


Pages: 525-530

Number of pages: 6

50 items in total

[1] Multi-task learning for simultaneous script identification and keyword spotting in document images
Cheikhrouhou, Ahmed
Kessentini, Yousri
Kanoun, Slim
PATTERN RECOGNITION, 2021, 113
[2] MuLTReNets: Multilingual text recognition networks for simultaneous script identification and handwriting recognition
Chen, Zhuo
Yin, Fei
Zhang, Xu-Yao
Yang, Qing
Liu, Cheng-Lin
PATTERN RECOGNITION, 2020, 108 (108)
[3] MULTI-TASK LEARNING IN DEEP NEURAL NETWORKS FOR IMPROVED PHONEME RECOGNITION
Seltzer, Michael L.
Droppo, Jasha
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6965 - 6969
[4] Multi-script handwritten digit recognition using multi-task learning
Gondere, Mesay Samuel
Schmidt-Thieme, Lars
Sharma, Durga Prasad
Scholz, Randolf
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (01) : 355 - 364
[5] Integrated Perception with Recurrent Multi-Task Neural Networks
Bilen, Hakan
Vedaldi, Andrea
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
[6] Convex Multi-Task Learning with Neural Networks
Ruiz, Carlos
Alaiz, Carlos M.
Dorronsoro, Jose R.
HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2022, 2022, 13469 : 223 - 235
[7] Adversarial Multi-task Learning of Deep Neural Networks for Robust Speech Recognition
Shinohara, Yusuke
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2369 - 2372
[8] Multi-task Neural Networks Convolutional Learning Model for Maize Disease Identification
Niyomwungere, Diane
Mwangi, Waweru
Rimiru, Richard
2022 IST-AFRICA CONFERENCE, 2022
[9] Multi-Task Learning for Food Identification and Analysis with Deep Convolutional Neural Networks
Zhang, Xi-Jin
Lu, Yi-Fan
Zhang, Song-Hai
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2016, 31 (03) : 489 - 500
