End-to-End Handwritten Text Detection and Transcription in Full Pages

Cited by: 14
Authors
Carbonell, Manuel [1]
Mas, Joan [2]
Villegas, Mauricio [1]
Fornes, Alicia [2]
Llados, Josep [2]
Affiliations
[1] Omni Us, Berlin, Germany
[2] Comp Vis Ctr, Barcelona, Spain
Keywords
Handwritten Text Recognition; Layout Analysis; Text Segmentation; Deep Neural Networks; Multi-task Learning
DOI
10.1109/ICDARW.2019.40077
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods perform recognition at the paragraph level. Even so, errors in paragraph segmentation can still degrade transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. Joint text detection and transcription removes the need for layout analysis at test time. The experimental results show that our approach can achieve results comparable to those of models that assume segmented paragraphs, and suggest that performing the two tasks jointly improves on performing them separately.
Pages: 29-34
Page count: 6