Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition

Cited by: 0
Authors
Bluche, Theodore [1]
Affiliations
[1] A2iA SAS, 39 Rue Bienfaisance, F-75008 Paris, France
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Offline handwriting recognition systems require cropped text line images for both training and recognition. On the one hand, the annotation of position and transcript at line level is costly to obtain. On the other hand, automatic line segmentation algorithms are prone to errors, compromising the subsequent recognition. In this paper, we propose a modification of the popular and efficient Multi-Dimensional Long Short-Term Memory Recurrent Neural Networks (MDLSTM-RNNs) to enable end-to-end processing of handwritten paragraphs. More particularly, we replace the collapse layer transforming the two-dimensional representation into a sequence of predictions by a recurrent version which can select one line at a time. In the proposed model, a neural network performs a kind of implicit line segmentation by computing attention weights on the image representation. The experiments on paragraphs of Rimes and IAM databases yield results that are competitive with those of networks trained at line level, and constitute a significant step towards end-to-end transcription of full documents.
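The difference between a plain collapse layer and an attention-weighted one can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the per-column softmax normalization, and the hand-set scores are all assumptions made for illustration. In the model described in the abstract, the attention scores would be produced by a recurrent network, once per text line, rather than fixed by hand.

```python
import math

def standard_collapse(features):
    """Sum an H x W x D feature map over the vertical axis.

    Returns W vectors of size D, one per horizontal position.
    """
    height, width, depth = len(features), len(features[0]), len(features[0][0])
    return [[sum(features[i][j][d] for i in range(height)) for d in range(depth)]
            for j in range(width)]

def column_softmax(scores):
    """Turn H x W scores into weights that sum to 1 within each column.

    Normalizing per column is an assumption of this sketch.
    """
    height, width = len(scores), len(scores[0])
    weights = [[0.0] * width for _ in range(height)]
    for j in range(width):
        col_max = max(scores[i][j] for i in range(height))  # for numerical stability
        exps = [math.exp(scores[i][j] - col_max) for i in range(height)]
        total = sum(exps)
        for i in range(height):
            weights[i][j] = exps[i] / total
    return weights

def weighted_collapse(features, scores):
    """Collapse the vertical axis with attention weights instead of a plain sum."""
    height, width, depth = len(features), len(features[0]), len(features[0][0])
    weights = column_softmax(scores)
    return [[sum(weights[i][j] * features[i][j][d] for i in range(height))
             for d in range(depth)]
            for j in range(width)]

# Toy feature map: 2 rows ("lines"), 3 columns, 1 feature per position.
features = [[[1.0], [1.0], [1.0]],
            [[5.0], [5.0], [5.0]]]
# Hypothetical scores that strongly favor the second row.
scores = [[0.0, 0.0, 0.0],
          [20.0, 20.0, 20.0]]

mixed = standard_collapse(features)             # both rows are summed together
selected = weighted_collapse(features, scores)  # dominated by the second row
```

With the plain sum, the two rows are mixed in every column. With weights concentrated on one row, the collapsed sequence is almost entirely that row's features, which is the sense in which attention can act as an implicit line selection. Calling the weighted collapse repeatedly with different scores would yield one sequence per line.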
Pages: 9
Related papers
50 in total
  • [21] Lattice Based Transcription Loss for End-to-End Speech Recognition
    Kang, Jian
    Zhang, Wei-Qiang
    Liu, Wei-Wei
    Liu, Jia
    Johnson, Michael T.
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90 (07): : 1013 - 1023
  • [22] Lattice Based Transcription Loss for End-to-End Speech Recognition
    Jian Kang
    Wei-Qiang Zhang
    Wei-Wei Liu
    Jia Liu
    Michael T. Johnson
    Journal of Signal Processing Systems, 2018, 90 : 1013 - 1023
  • [23] Joint CTC/attention decoding for end-to-end speech recognition
    Hori, Takaaki
    Watanabe, Shinji
    Hershey, John R.
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 518 - 529
  • [24] End-to-end attention convolutional recurrent network for online handwritten Chinese text recognition
    Qu, Xiwen
    Wu, Zhihong
    Huang, Jun
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (23) : 62541 - 62558
  • [25] FPRNet: End-to-End Full-Page Recognition Model for Handwritten Chinese Essay
    Su, Tonghua
    You, Hongming
    Liu, Shuchen
    Wang, Zhongjie
    FRONTIERS IN HANDWRITING RECOGNITION, ICFHR 2022, 2022, 13639 : 231 - 244
  • [26] Improvement of End-to-End Offline Handwritten Mathematical Expression Recognition by Weakly Supervised Learning
    Thanh-Nghia Truong
    Cuong Tuan Nguyen
    Khanh Minh Phan
    Nakagawa, Masaki
    2020 17TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2020), 2020, : 181 - 186
  • [27] End-to-end handwritten Ge'ez multiple numerals recognition using deep learning
    Malhotra, Ruchika
    Addis, Maru Tesfaye
    SICE JOURNAL OF CONTROL MEASUREMENT AND SYSTEM INTEGRATION, 2024, 17 (01) : 122 - 134
  • [28] Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification
    Zhang, C.
    Li, B.
    Sainath, T. N.
    Strohman, T.
    Mavandadi, S.
    Chang, S.
    Haghani, P.
    INTERSPEECH 2022, 2022, : 3223 - 3227
  • [29] JOINT PHONEME-GRAPHEME MODEL FOR END-TO-END SPEECH RECOGNITION
    Kubo, Yotaro
    Bacchiani, Michiel
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6119 - 6123
  • [30] An End-to-End Classifier Based on CNN for In-Air Handwritten-Chinese-Character Recognition
    Hu, Mianjun
    Qu, Xiwen
    Huang, Jun
    Wu, Xuangou
    APPLIED SCIENCES-BASEL, 2022, 12 (14):