Simultaneous Script Identification and Handwriting Recognition via Multi-Task Learning of Recurrent Neural Networks

Cited by: 23
Authors
Chen, Zhuo [1, 2]
Wu, Yichao [1, 2]
Yin, Pei [1]
Liu, Cheng-Lin [1, 2]
Affiliations
[1] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, 95 Zhongguan East Rd, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
multi-task learning; SepMDLSTM; script identification; language identification; handwritten text recognition;
DOI
10.1109/ICDAR.2017.92
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a method for simultaneous script identification and handwritten text line recognition in a multi-task learning framework. First, we use Separable Multi-Dimensional Long Short-Term Memory (SepMDLSTM) to encode the input text line images based on convolutional feature extraction. Then, the extracted features are fed into two classification modules, one for script identification and one for multi-script text recognition. All network parameters are trained end-to-end by multi-task learning, in which the script identification task minimizes the Negative Log Likelihood (NLL) loss and the text recognition task minimizes the Connectionist Temporal Classification (CTC) loss. We evaluated the proposed method on handwritten text line datasets in three languages: IAM (English), Rimes (French) and IFN/ENIT (Arabic). Experimental results demonstrate that the multi-task learning framework achieves superior performance on both script identification and text recognition. In particular, the accuracy of script identification is higher than 99.9%, and the character error rate (CER) of text recognition is even lower than that of some single-script text recognition systems.
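The two training objectives named in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the abstract does not say how the NLL and CTC losses are combined, so the weighted sum and the `script_weight` parameter are assumptions, and every function name here is hypothetical. The SepMDLSTM encoder is not modeled; the sketch takes per-frame character scores and per-line script scores as given.

```python
import math


def log_softmax(scores):
    """Turn raw scores into log-probabilities (numerically stable)."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def logsumexp(values):
    """log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def nll_loss(script_scores, script_id):
    """Negative log-likelihood of the true script class."""
    return -log_softmax(script_scores)[script_id]


def ctc_loss(frame_scores, target, blank=0):
    """Negative log-probability of `target` via the CTC forward algorithm."""
    log_probs = [log_softmax(frame) for frame in frame_scores]
    # Interleave blanks around the labels: blank, l1, blank, l2, ..., blank.
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)
    alpha = [-math.inf] * size
    alpha[0] = log_probs[0][ext[0]]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]
    for t in range(1, len(log_probs)):
        new_alpha = [-math.inf] * size
        for s in range(size):
            candidates = [alpha[s]]
            if s >= 1:
                candidates.append(alpha[s - 1])
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(alpha[s - 2])
            new_alpha[s] = logsumexp(candidates) + log_probs[t][ext[s]]
        alpha = new_alpha
    # Valid paths end on the last label or on the trailing blank.
    return -logsumexp(alpha[-2:])


def multitask_loss(frame_scores, target, script_scores, script_id,
                   script_weight=1.0):
    """Assumed combination: CTC loss plus a weighted script NLL loss."""
    return (ctc_loss(frame_scores, target)
            + script_weight * nll_loss(script_scores, script_id))
```

With uniform scores over three character classes and two frames, the one-label target has three valid alignments out of nine, so `ctc_loss` returns log 3; a uniform score over three scripts gives the same value for `nll_loss`.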
Pages: 525-530
Number of pages: 6