Are End-to-End Systems Really Necessary for NER on Handwritten Document Images?

Cited by: 7
Authors
Tueselmann, Oliver [1]
Wolf, Fabian [1]
Fink, Gernot A. [1]
Affiliations
[1] TU Dortmund Univ, Dept Comp Sci, D-44227 Dortmund, Germany
Keywords
Named entity recognition; Document image analysis; Information retrieval; Handwritten documents; RECOGNITION;
DOI
10.1007/978-3-030-86331-9_52
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Named entities (NEs) are fundamental in the extraction of information from text. The recognition and classification of these entities into predefined categories is called Named Entity Recognition (NER) and plays a major role in Natural Language Processing. However, only a few works consider this task with respect to the document image domain. The approaches are either based on a two-stage or end-to-end architecture. A two-stage approach transforms the document image into a textual representation and determines the NEs using a textual NER. The end-to-end approach, on the other hand, avoids the explicit recognition step at text level and determines the NEs directly on image level. Current approaches that try to tackle the task of NER on segmented word images use end-to-end architectures. This is motivated by the assumption that handwriting recognition is too erroneous to allow for an effective application of textual NLP methods. In this work, we present a two-stage approach and compare it against state-of-the-art end-to-end approaches. Due to the lack of datasets and evaluation protocols, such a comparison is currently difficult. Therefore, we manually annotated the known IAM and George Washington datasets with NE labels and publish them along with optimized splits and an evaluation protocol. Our experiments show, contrary to the common belief, that a two-stage model can achieve higher scores on all tested datasets.
Pages: 808-822
Number of pages: 15