Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition

Cited by: 0
Author
Bluche, Theodore [1]
Affiliation
[1] A2iA SAS, 39 Rue Bienfaisance, F-75008 Paris, France
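Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract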
Offline handwriting recognition systems require cropped text line images for both training and recognition. On the one hand, the annotation of position and transcript at line level is costly to obtain. On the other hand, automatic line segmentation algorithms are prone to errors, compromising the subsequent recognition. In this paper, we propose a modification of the popular and efficient Multi-Dimensional Long Short-Term Memory Recurrent Neural Networks (MDLSTM-RNNs) to enable end-to-end processing of handwritten paragraphs. More particularly, we replace the collapse layer transforming the two-dimensional representation into a sequence of predictions by a recurrent version which can select one line at a time. In the proposed model, a neural network performs a kind of implicit line segmentation by computing attention weights on the image representation. The experiments on paragraphs of Rimes and IAM databases yield results that are competitive with those of networks trained at line level, and constitute a significant step towards end-to-end transcription of full documents.
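The difference between a standard collapse layer and the attention-based, line-by-line collapse described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, the `params` dictionary, and the linear scorer are hypothetical stand-ins for the attention network, whose actual architecture the abstract does not detail. The sketch only shows the core idea that each step applies softmax weights over the vertical axis of a feature map, so that one step yields one sequence of feature vectors (one text line).

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def standard_collapse(features):
    """Standard collapse: sum the (H, W, D) feature map over height -> (W, D)."""
    return features.sum(axis=0)


def attention_collapse(features, n_lines, params):
    """Attention-weighted collapse applied n_lines times.

    Each step computes softmax weights over the vertical axis of every
    column and returns the weighted sum, i.e. one (W, D) sequence per step.
    The scorer below is an assumption made for illustration only.
    """
    H, W, D = features.shape
    cum_attn = np.zeros((H, W))  # attention accumulated over previous steps
    outputs, attns = [], []
    for _ in range(n_lines):
        # Hypothetical linear scorer: feature projection plus a term that
        # depends on where the model has already attended.
        scores = features @ params["w_feat"] + params["w_prev"] * cum_attn
        attn = softmax(scores, axis=0)  # (H, W); each column sums to 1
        outputs.append((attn[:, :, None] * features).sum(axis=0))
        attns.append(attn)
        cum_attn = cum_attn + attn
    return np.stack(outputs), np.stack(attns)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(6, 5, 4))  # toy feature map: H=6, W=5, D=4
    params = {"w_feat": rng.normal(size=4), "w_prev": -5.0}
    lines, attn_maps = attention_collapse(feats, n_lines=3, params=params)
    print(lines.shape, attn_maps.shape)
```

With a negative `w_prev`, positions that were already attended receive lower scores at later steps, which loosely mimics moving from one line to the next. In a trained model, these weights would be learned jointly with the transcription objective rather than set by hand.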
Pages: 9
Related papers
50 in total
  • [31] Training an End-to-End Model for Offline Handwritten Japanese Text Recognition by Generated Synthetic Patterns
    Nam Tuan Ly
    Cuong Tuan Nguyen
    Nakagawa, Masaki
    PROCEEDINGS 2018 16TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2018, : 74 - 79
  • [32] Track, Attend, and Parse (TAP): An End-to-End Framework for Online Handwritten Mathematical Expression Recognition
    Zhang, Jianshu
    Du, Jun
    Dai, Lirong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (01) : 221 - 233
  • [33] Combining CNN and Transformer as Encoder to Improve End-to-End Handwritten Mathematical Expression Recognition Accuracy
    Zhang, Zhang
    Zhang, Yibo
    FRONTIERS IN HANDWRITING RECOGNITION, ICFHR 2022, 2022, 13639 : 185 - 197
  • [34] End-To-End Deep-Learning-Based Tamil Handwritten Document Recognition and Classification Model
    Vinotheni, C.
    Pandian, S. Lakshmana
    IEEE ACCESS, 2023, 11 : 43195 - 43204
  • [35] An End-to-End Network for Panoptic Segmentation
    Liu, Huanyu
    Peng, Chao
    Yu, Changqian
    Wang, Jingbo
    Liu, Xu
    Yu, Gang
    Jiang, Wei
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 6165 - 6174
  • [36] End-to-end motion segmentation and recognition with high accuracy using wearable devices
    Bingtao, Zhou
    Yuyang, Cheng
    Mian, Xiang
    ELECTRONICS LETTERS, 2022, 58 (13) : 511 - 513
  • [37] End-to-End Phoneme Recognition using Models from Semantic Image Segmentation
    Gao, Wei
    Hashemi-Sakhtsari, Ahmad
    McDonnell, Mark D.
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [38] END-TO-END TRAINING OF A LARGE VOCABULARY END-TO-END SPEECH RECOGNITION SYSTEM
    Kim, Chanwoo
    Kim, Sungsoo
    Kim, Kwangyoun
    Kumar, Mehul
    Kim, Jiyeon
    Lee, Kyungmin
    Han, Changwoo
    Garg, Abhinav
    Kim, Eunhyang
    Shin, Minkyoo
    Singh, Shatrughan
    Heck, Larry
    Gowda, Dhananjaya
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 562 - 569
  • [39] AN END-TO-END APPROACH TO JOINT SOCIAL SIGNAL DETECTION AND AUTOMATIC SPEECH RECOGNITION
    Inaguma, Hirofumi
    Mimura, Masato
    Inoue, Koji
    Yoshii, Kazuyoshi
    Kawahara, Tatsuya
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6214 - 6218
  • [40] LANGUAGE INDEPENDENT END-TO-END ARCHITECTURE FOR JOINT LANGUAGE IDENTIFICATION AND SPEECH RECOGNITION
    Watanabe, Shinji
    Hori, Takaaki
    Hershey, John R.
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 265 - 271