Are End-to-End Systems Really Necessary for NER on Handwritten Document Images?

Cited by: 7
Authors
Tueselmann, Oliver [1]
Wolf, Fabian [1]
Fink, Gernot A. [1]
Affiliations
[1] TU Dortmund Univ, Dept Comp Sci, D-44227 Dortmund, Germany
Keywords
Named entity recognition; Document image analysis; Information retrieval; Handwritten documents; Recognition
DOI
10.1007/978-3-030-86331-9_52
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Named entities (NEs) are fundamental to the extraction of information from text. The recognition and classification of these entities into predefined categories is called Named Entity Recognition (NER) and plays a major role in Natural Language Processing. However, only a few works consider this task in the document image domain. Existing approaches are based on either a two-stage or an end-to-end architecture. A two-stage approach transforms the document image into a textual representation and determines the NEs using a textual NER model. The end-to-end approach, on the other hand, avoids the explicit recognition step at text level and determines the NEs directly at image level. Current approaches that tackle the task of NER on segmented word images use end-to-end architectures. This is motivated by the assumption that handwriting recognition is too error-prone to allow for an effective application of textual NLP methods. In this work, we present a two-stage approach and compare it against state-of-the-art end-to-end approaches. Due to the lack of datasets and evaluation protocols, such a comparison is currently difficult. Therefore, we manually annotated the well-known IAM and George Washington datasets with NE labels and publish them along with optimized splits and an evaluation protocol. Our experiments show, contrary to the common belief, that a two-stage model can achieve higher scores on all tested datasets.
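The two-stage architecture described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the recognizer is a hypothetical lookup table standing in for a handwriting recognition model, the tagger is a toy dictionary lookup standing in for a textual NER model, and all names (`recognize_words`, `tag_entities`, `two_stage_ner`, the `PER` label) are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Tuple

# A "word image" is represented here by an opaque ID string (assumption);
# a real system would pass pixel data to a handwriting recognizer.
WordImage = str


def recognize_words(images: List[WordImage],
                    htr: Callable[[WordImage], str]) -> List[str]:
    """Stage 1: turn each segmented word image into a text token."""
    return [htr(image) for image in images]


def tag_entities(tokens: List[str],
                 gazetteer: Dict[str, str]) -> List[Tuple[str, str]]:
    """Stage 2: label each token; 'O' marks tokens outside any entity.

    A dictionary lookup is a toy stand-in for a trained textual NER model.
    """
    return [(token, gazetteer.get(token.lower(), "O")) for token in tokens]


def two_stage_ner(images: List[WordImage],
                  htr: Callable[[WordImage], str],
                  gazetteer: Dict[str, str]) -> List[Tuple[str, str]]:
    """Chain recognition and tagging; stage 2 only ever sees text."""
    return tag_entities(recognize_words(images, htr), gazetteer)


# Hypothetical transcriptions standing in for recognizer output.
fake_transcriptions = {
    "img0": "George",
    "img1": "Washington",
    "img2": "wrote",
    "img3": "letters",
}
toy_gazetteer = {"george": "PER", "washington": "PER"}

result = two_stage_ner(
    ["img0", "img1", "img2", "img3"],
    fake_transcriptions.__getitem__,
    toy_gazetteer,
)
print(result)
```

Because the tagger consumes only the recognizer's text output, any recognition error propagates into the second stage; this is the concern that motivates end-to-end models, and the one the abstract reports to be less limiting than commonly assumed.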
Pages: 808-822
Number of pages: 15